AMD AOCC 3.2 Compiler Benchmarks

AMD EPYC 72F3 benchmarks of the AOCC 3.2 compiler and prior releases. Benchmarks by Michael Larabel for a future article.

AMD AOCC 3.2:

  Processor: AMD EPYC 72F3 8-Core @ 3.70GHz (8 Cores / 16 Threads), Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS), Chipset: AMD Starship/Matisse, Memory: 126GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP + 1000GB Corsair Force MP600, Graphics: ASPEED, Monitor: VE228, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe

  OS: Ubuntu 21.04, Kernel: 5.14.0-rc7-amd-pstate-phx (x86_64) 20210909, Desktop: GNOME Shell 3.38.4, Display Server: X Server, Compiler: Clang 13.0.0, File-System: ext4, Screen Resolution: 1920x1080

AMD AOCC 3.1:

  Processor: AMD EPYC 72F3 8-Core @ 3.70GHz (8 Cores / 16 Threads), Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS), Chipset: AMD Starship/Matisse, Memory: 126GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP + 1000GB Corsair Force MP600, Graphics: ASPEED, Monitor: VE228, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe

  OS: Ubuntu 21.04, Kernel: 5.14.0-rc7-amd-pstate-phx (x86_64) 20210909, Desktop: GNOME Shell 3.38.4, Display Server: X Server, Compiler: Clang 12.0.0, File-System: ext4, Screen Resolution: 1920x1080

AMD AOCC 3.0:

  Processor: AMD EPYC 72F3 8-Core @ 3.70GHz (8 Cores / 16 Threads), Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS), Chipset: AMD Starship/Matisse, Memory: 126GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP + 1000GB Corsair Force MP600, Graphics: ASPEED, Monitor: VE228, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe

  OS: Ubuntu 21.04, Kernel: 5.14.0-rc7-amd-pstate-phx (x86_64) 20210909, Desktop: GNOME Shell 3.38.4, Display Server: X Server, Compiler: Clang 12.0.0, File-System: ext4, Screen Resolution: 1920x1080

Apache HTTP Server 2.4.48
Concurrent Requests: 1
Requests Per Second > Higher Is Better
AMD AOCC 3.2 . 6329.40
AMD AOCC 3.1 . 6015.71
AMD AOCC 3.0 . 6241.24

Apache HTTP Server 2.4.48
Concurrent Requests: 200
Requests Per Second > Higher Is Better
AMD AOCC 3.2 . 89516.52
AMD AOCC 3.1 . 85820.09
AMD AOCC 3.0 . 87051.70

Apache HTTP Server 2.4.48
Concurrent Requests: 500
Requests Per Second > Higher Is Better
AMD AOCC 3.2 . 84374.51
AMD AOCC 3.1 . 82157.17
AMD AOCC 3.0 . 82641.76

Basis Universal 1.13
Settings: UASTC Level 0
Seconds < Lower Is Better
AMD AOCC 3.2 . 6.768
AMD AOCC 3.1 . 6.774
AMD AOCC 3.0 . 6.898

Basis Universal 1.13
Settings: UASTC Level 2
Seconds < Lower Is Better
AMD AOCC 3.2 . 27.53
AMD AOCC 3.1 . 27.60
AMD AOCC 3.0 . 27.76

Basis Universal 1.13
Settings: UASTC Level 3
Seconds < Lower Is Better
AMD AOCC 3.2 . 51.15
AMD AOCC 3.1 . 51.40
AMD AOCC 3.0 . 51.52

Botan 2.17.3
Test: KASUMI
MiB/s > Higher Is Better
AMD AOCC 3.2 . 98.00
AMD AOCC 3.1 . 98.87
AMD AOCC 3.0 . 96.47

Botan 2.17.3
Test: Twofish
MiB/s > Higher Is Better
AMD AOCC 3.2 . 368.11
AMD AOCC 3.1 . 368.27
AMD AOCC 3.0 . 356.96

Botan 2.17.3
Test: Twofish - Decrypt
MiB/s > Higher Is Better
AMD AOCC 3.2 . 397.06
AMD AOCC 3.1 . 376.20
AMD AOCC 3.0 . 355.70

Botan 2.17.3
Test: CAST-256
MiB/s > Higher Is Better
AMD AOCC 3.2 . 157.58
AMD AOCC 3.1 . 148.84
AMD AOCC 3.0 . 149.82

Botan 2.17.3
Test: CAST-256 - Decrypt
MiB/s > Higher Is Better
AMD AOCC 3.2 . 154.61
AMD AOCC 3.1 . 152.68
AMD AOCC 3.0 . 150.03

Chia Blockchain VDF 1.0.1
Test: Square Plain C++
IPS > Higher Is Better
AMD AOCC 3.2 . 186667
AMD AOCC 3.1 . 186467
AMD AOCC 3.0 . 185000

Chia Blockchain VDF 1.0.1
Test: Square Assembly Optimized
IPS > Higher Is Better
AMD AOCC 3.2 . 127313
AMD AOCC 3.1 . 124687
AMD AOCC 3.0 . 123907

Coremark 1.0
CoreMark Size 666 - Iterations Per Second
Iterations/Sec > Higher Is Better
AMD AOCC 3.2 . 358684.33
AMD AOCC 3.1 . 348059.95
AMD AOCC 3.0 . 348647.22

CppPerformanceBenchmarks 9
Test: Atol
Seconds < Lower Is Better
AMD AOCC 3.2 . 43.76
AMD AOCC 3.1 . 43.50
AMD AOCC 3.0 . 43.33

dav1d 0.9.2
Video Input: Chimera 1080p
FPS > Higher Is Better
AMD AOCC 3.2 . 538.84
AMD AOCC 3.1 . 537.39
AMD AOCC 3.0 . 531.81

dav1d 0.9.2
Video Input: Summer Nature 1080p
FPS > Higher Is Better
AMD AOCC 3.2 . 504.53
AMD AOCC 3.1 . 504.09
AMD AOCC 3.0 . 497.87

Etcpak 0.7
Configuration: ETC2
Mpx/s > Higher Is Better
AMD AOCC 3.2 . 231.49
AMD AOCC 3.1 . 229.49
AMD AOCC 3.0 . 208.99

FLAC Audio Encoding 1.3.3
WAV To FLAC
Seconds < Lower Is Better
AMD AOCC 3.2 . 15.87
AMD AOCC 3.1 . 15.85
AMD AOCC 3.0 . 15.92

GraphicsMagick 1.3.33
Operation: Sharpen
Iterations Per Minute > Higher Is Better
AMD AOCC 3.2 . 123
AMD AOCC 3.1 . 113
AMD AOCC 3.0 . 116

JPEG XL Decoding libjxl 0.6.1
CPU Threads: 1
MP/s > Higher Is Better
AMD AOCC 3.2 . 66.73
AMD AOCC 3.1 . 66.92
AMD AOCC 3.0 . 65.99

JPEG XL libjxl 0.6.1
Input: JPEG - Encode Speed: 8
MP/s > Higher Is Better
AMD AOCC 3.2 . 28.69
AMD AOCC 3.1 . 28.61
AMD AOCC 3.0 . 28.47

KTX-Software toktx 4.0
Settings: Zstd Compression 19
Seconds < Lower Is Better
AMD AOCC 3.2 . 17.56
AMD AOCC 3.1 . 17.49
AMD AOCC 3.0 . 17.78

KTX-Software toktx 4.0
Settings: UASTC 3 + Zstd Compression 19
Seconds < Lower Is Better
AMD AOCC 3.2 . 16.57
AMD AOCC 3.1 . 16.58
AMD AOCC 3.0 . 16.59

Kvazaar 2.1
Video Input: Bosphorus 4K - Video Preset: Medium
Frames Per Second > Higher Is Better
AMD AOCC 3.2 . 8.37
AMD AOCC 3.1 . 8.26

Kvazaar 2.1
Video Input: Bosphorus 4K - Video Preset: Very Fast
Frames Per Second > Higher Is Better
AMD AOCC 3.2 . 19.17
AMD AOCC 3.1 . 19.13

Kvazaar 2.1
Video Input: Bosphorus 4K - Video Preset: Ultra Fast
Frames Per Second > Higher Is Better
AMD AOCC 3.2 . 32.26
AMD AOCC 3.1 . 32.15

LeelaChessZero 0.28
Backend: BLAS
Nodes Per Second > Higher Is Better
AMD AOCC 3.2 . 1980
AMD AOCC 3.1 . 1909
AMD AOCC 3.0 . 1971

LeelaChessZero 0.28
Backend: Eigen
Nodes Per Second > Higher Is Better
AMD AOCC 3.2 . 2085
AMD AOCC 3.1 . 2150
AMD AOCC 3.0 . 1995

libjpeg-turbo tjbench 2.1.0
Test: Decompression Throughput
Megapixels/sec > Higher Is Better
AMD AOCC 3.2 . 244.12
AMD AOCC 3.1 . 244.49
AMD AOCC 3.0 . 233.05

Liquid-DSP 2021.01.31
Threads: 16 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
AMD AOCC 3.2 . 695340000
AMD AOCC 3.1 . 636810000
AMD AOCC 3.0 . 689073333

LZ4 Compression 1.9.3
Compression Level: 1 - Compression Speed
MB/s > Higher Is Better
AMD AOCC 3.2 . 13906.92
AMD AOCC 3.1 . 13770.38
AMD AOCC 3.0 . 13663.52

NCNN 20210720
Target: CPU - Model: mobilenet
ms < Lower Is Better
AMD AOCC 3.2 . 12.54
AMD AOCC 3.1 . 13.29
AMD AOCC 3.0 . 13.47

NCNN 20210720
Target: CPU-v2-v2 - Model: mobilenet-v2
ms < Lower Is Better
AMD AOCC 3.2 . 4.26
AMD AOCC 3.1 . 4.48
AMD AOCC 3.0 . 4.55

NCNN 20210720
Target: CPU-v3-v3 - Model: mobilenet-v3
ms < Lower Is Better
AMD AOCC 3.2 . 3.79
AMD AOCC 3.1 . 3.98
AMD AOCC 3.0 . 4.02

NCNN 20210720
Target: CPU - Model: shufflenet-v2
ms < Lower Is Better
AMD AOCC 3.2 . 4.94
AMD AOCC 3.1 . 5.05
AMD AOCC 3.0 . 5.04

NCNN 20210720
Target: CPU - Model: mnasnet
ms < Lower Is Better
AMD AOCC 3.2 . 3.86
AMD AOCC 3.1 . 3.95
AMD AOCC 3.0 . 4.01

NCNN 20210720
Target: CPU - Model: efficientnet-b0
ms < Lower Is Better
AMD AOCC 3.2 . 5.55
AMD AOCC 3.1 . 5.70
AMD AOCC 3.0 . 5.79

NCNN 20210720
Target: CPU - Model: resnet50
ms < Lower Is Better
AMD AOCC 3.2 . 16.71
AMD AOCC 3.1 . 17.07
AMD AOCC 3.0 . 17.31

NCNN 20210720
Target: CPU - Model: yolov4-tiny
ms < Lower Is Better
AMD AOCC 3.2 . 21.46
AMD AOCC 3.1 . 21.40
AMD AOCC 3.0 . 21.94

NCNN 20210720
Target: CPU - Model: squeezenet_ssd
ms < Lower Is Better
AMD AOCC 3.2 . 18.27
AMD AOCC 3.1 . 18.53
AMD AOCC 3.0 . 18.44

Ngspice 34
Circuit: C2670
Seconds < Lower Is Better
AMD AOCC 3.2 . 88.42
AMD AOCC 3.1 . 89.12
AMD AOCC 3.0 . 91.50

oneDNN 2.1.2
Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU
ms < Lower Is Better
AMD AOCC 3.2 . 1.90323
AMD AOCC 3.1 . 1.96347
AMD AOCC 3.0 . 2.11471

oneDNN 2.1.2
Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
ms < Lower Is Better
AMD AOCC 3.2 . 3.87713
AMD AOCC 3.1 . 3.88522
AMD AOCC 3.0 . 3.99675

ONNX Runtime 1.10
Model: shufflenet-v2-10 - Device: CPU
Inferences Per Minute > Higher Is Better
AMD AOCC 3.2 . 21250
AMD AOCC 3.1 . 19634
AMD AOCC 3.0 . 19886

Primesieve 7.7
1e12 Prime Number Generation
Seconds < Lower Is Better
AMD AOCC 3.2 . 20.95
AMD AOCC 3.1 . 20.99
AMD AOCC 3.0 . 21.08

QuantLib 1.21
MFLOPS > Higher Is Better
AMD AOCC 3.2 . 3208.6
AMD AOCC 3.1 . 3151.1
AMD AOCC 3.0 . 3159.8

RNNoise 2020-06-28
Seconds < Lower Is Better
AMD AOCC 3.2 . 16.84
AMD AOCC 3.1 . 16.94
AMD AOCC 3.0 . 16.88

Stockfish 13
Total Time
Nodes Per Second > Higher Is Better
AMD AOCC 3.2 . 25833642
AMD AOCC 3.1 . 25929053
AMD AOCC 3.0 . 25435242

SVT-HEVC 1.5.0
Tuning: 1 - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
AMD AOCC 3.2 . 9.17
AMD AOCC 3.1 . 9.11
AMD AOCC 3.0 . 9.04

SVT-HEVC 1.5.0
Tuning: 7 - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
AMD AOCC 3.2 . 117.07
AMD AOCC 3.1 . 116.46
AMD AOCC 3.0 . 115.76

SVT-HEVC 1.5.0
Tuning: 10 - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
AMD AOCC 3.2 . 231.07
AMD AOCC 3.1 . 229.69
AMD AOCC 3.0 . 228.54

SVT-VP9 0.3
Tuning: VMAF Optimized - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
AMD AOCC 3.2 . 151.30
AMD AOCC 3.1 . 149.85
AMD AOCC 3.0 . 150.26

WebP Image Encode 1.1
Encode Settings: Quality 100, Highest Compression
Encode Time - Seconds < Lower Is Better
AMD AOCC 3.2 . 5.548
AMD AOCC 3.1 . 5.567
AMD AOCC 3.0 . 5.612

Zstd Compression 1.5.0
Compression Level: 3 - Compression Speed
MB/s > Higher Is Better
AMD AOCC 3.2 . 3251.9
AMD AOCC 3.1 . 3284.5
AMD AOCC 3.0 . 3101.4

Zstd Compression 1.5.0
Compression Level: 3 - Decompression Speed
MB/s > Higher Is Better
AMD AOCC 3.2 . 4009.5
AMD AOCC 3.1 . 3907.9
AMD AOCC 3.0 . 3946.6

Zstd Compression 1.5.0
Compression Level: 8 - Compression Speed
MB/s > Higher Is Better
AMD AOCC 3.2 . 1396.2
AMD AOCC 3.1 . 1367.3
AMD AOCC 3.0 . 1370.8

Zstd Compression 1.5.0
Compression Level: 8 - Decompression Speed
MB/s > Higher Is Better
AMD AOCC 3.2 . 4138.3
AMD AOCC 3.1 . 4019.9
AMD AOCC 3.0 . 4140.6

Zstd Compression 1.5.0
Compression Level: 19 - Compression Speed
MB/s > Higher Is Better
AMD AOCC 3.2 . 51.7
AMD AOCC 3.1 . 51.8
AMD AOCC 3.0 . 52.0

Zstd Compression 1.5.0
Compression Level: 19 - Decompression Speed
MB/s > Higher Is Better
AMD AOCC 3.2 . 3717.6
AMD AOCC 3.1 . 3602.1
AMD AOCC 3.0 . 3674.6

Zstd Compression 1.5.0
Compression Level: 3, Long Mode - Compression Speed
MB/s > Higher Is Better
AMD AOCC 3.2 . 443.2
AMD AOCC 3.1 . 442.3
AMD AOCC 3.0 . 444.7

Zstd Compression 1.5.0
Compression Level: 3, Long Mode - Decompression Speed
MB/s > Higher Is Better
AMD AOCC 3.2 . 4392.4
AMD AOCC 3.1 . 4200.0
AMD AOCC 3.0 . 4309.5

Geometric Mean Of All Test Results
Result Composite - AMD AOCC 3.2 Compiler Benchmarks
Geometric Mean > Higher Is Better
AMD AOCC 3.2 . 287.33
AMD AOCC 3.1 . 282.22
AMD AOCC 3.0 . 279.98
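
The composite scores above are easier to compare as relative changes. The following is a minimal Python sketch, not part of the original result file, that turns the three reported geometric-mean values into percentage differences and, as a second illustration, takes the geometric mean of the AOCC 3.2 / AOCC 3.1 ratios for the three Apache runs. The names `composite`, `percent_change`, and `apache` are hypothetical helpers introduced here; how the test suite itself folds lower-is-better results into its composite is not shown in this file and is not reproduced.

```python
from statistics import geometric_mean

# Composite scores copied from the "Geometric Mean Of All Test Results" block.
composite = {"AOCC 3.2": 287.33, "AOCC 3.1": 282.22, "AOCC 3.0": 279.98}


def percent_change(new: float, old: float) -> float:
    """Relative change of `new` versus `old`, in percent."""
    return (new / old - 1.0) * 100.0


uplift_vs_31 = percent_change(composite["AOCC 3.2"], composite["AOCC 3.1"])
uplift_vs_30 = percent_change(composite["AOCC 3.2"], composite["AOCC 3.0"])

# Apache requests per second (higher is better), keyed by concurrent requests:
# (AOCC 3.2 result, AOCC 3.1 result), copied from the listing above.
apache = {
    1: (6329.40, 6015.71),
    200: (89516.52, 85820.09),
    500: (84374.51, 82157.17),
}

# Geometric mean of per-test speedup ratios; only valid as written for
# higher-is-better metrics (invert the ratio for lower-is-better ones).
apache_ratio = geometric_mean([new / old for new, old in apache.values()])

print(f"AOCC 3.2 vs 3.1 composite: {uplift_vs_31:+.2f}%")
print(f"AOCC 3.2 vs 3.0 composite: {uplift_vs_30:+.2f}%")
print(f"Apache geomean ratio, 3.2 / 3.1: {apache_ratio:.4f}")
```

A geometric mean is used for the ratio summary because it treats a speedup and the matching slowdown symmetrically, which an arithmetic mean of ratios does not.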