AMD AOCC 3.1 Compiler Comparison

AMD EPYC 7543 testing of AMD AOCC 3.1 compiler benchmarks by Michael Larabel for a future article.

AOCC 3.1:

  Processor: AMD EPYC 7543 32-Core @ 2.80GHz (32 Cores / 64 Threads), Motherboard: TYAN S8036GM2NE-LE (V2.00.B21 BIOS), Chipset: AMD Starship/Matisse, Memory: 64GB, Disk: 1000GB Corsair Force MP600, Graphics: ASPEED, Monitor: VE228, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe

  OS: Ubuntu 21.04, Kernel: 5.11.0-25-generic (x86_64), Desktop: GNOME Shell 3.38.4, Display Server: X Server, Compiler: Clang 12.0.0, File-System: ext4, Screen Resolution: 1920x1080

Clang 12.0:

  Processor: AMD EPYC 7543 32-Core @ 2.80GHz (32 Cores / 64 Threads), Motherboard: TYAN S8036GM2NE-LE (V2.00.B21 BIOS), Chipset: AMD Starship/Matisse, Memory: 64GB, Disk: 1000GB Corsair Force MP600, Graphics: ASPEED, Monitor: VE228, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe

  OS: Ubuntu 21.04, Kernel: 5.11.0-25-generic (x86_64), Desktop: GNOME Shell 3.38.4, Display Server: X Server, Compiler: Clang 12.0.1-++20210630032617+fed41342a82f-1~exp1~20210630133328.128, File-System: ext4, Screen Resolution: 1920x1080

GCC 11.1:

  Processor: AMD EPYC 7543 32-Core @ 2.80GHz (32 Cores / 64 Threads), Motherboard: TYAN S8036GM2NE-LE (V2.00.B21 BIOS), Chipset: AMD Starship/Matisse, Memory: 64GB, Disk: 1000GB Corsair Force MP600, Graphics: ASPEED, Monitor: VE228, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe

  OS: Ubuntu 21.04, Kernel: 5.11.0-25-generic (x86_64), Desktop: GNOME Shell 3.38.4, Display Server: X Server, Compiler: GCC 11.1.0, File-System: ext4, Screen Resolution: 1920x1080

C-Blosc 2.0
Compressor: blosclz
MB/s > Higher Is Better
AOCC 3.1 ... 25069.6
Clang 12.0 . 25417.8
GCC 11.1 ... 25536.1

QuantLib 1.21
MFLOPS > Higher Is Better
AOCC 3.1 ... 2854.9
Clang 12.0 . 2838.6
GCC 11.1 ... 2861.2

Etcpak 0.7
Configuration: DXT1
Mpx/s > Higher Is Better
AOCC 3.1 ... 2897.28
Clang 12.0 . 2862.40
GCC 11.1 ... 1170.47

Etcpak 0.7
Configuration: ETC2
Mpx/s > Higher Is Better
AOCC 3.1 ... 207.55
Clang 12.0 . 214.17
GCC 11.1 ... 183.80

LZ4 Compression 1.9.3
Compression Level: 1 - Compression Speed
MB/s > Higher Is Better
AOCC 3.1 ... 10964.11
Clang 12.0 . 10375.18
GCC 11.1 ... 12074.05

Zstd Compression 1.5.0
Compression Level: 8 - Compression Speed
MB/s > Higher Is Better
AOCC 3.1 ... 2852.4
Clang 12.0 . 2708.4
GCC 11.1 ... 2792.3

Zstd Compression 1.5.0
Compression Level: 19, Long Mode - Compression Speed
MB/s > Higher Is Better
AOCC 3.1 ... 42.1
Clang 12.0 . 42.9
GCC 11.1 ... 37.6

Zstd Compression 1.5.0
Compression Level: 19, Long Mode - Decompression Speed
MB/s > Higher Is Better
AOCC 3.1 ... 3264.4
Clang 12.0 . 3315.9
GCC 11.1 ... 3434.5

JPEG XL 0.3.3
Input: PNG - Encode Speed: 7
MP/s > Higher Is Better
AOCC 3.1 ... 9.04
Clang 12.0 . 9.49

JPEG XL 0.3.3
Input: JPEG - Encode Speed: 7
MP/s > Higher Is Better
AOCC 3.1 ... 65.89
Clang 12.0 . 64.35

JPEG XL 0.3.3
Input: JPEG - Encode Speed: 8
MP/s > Higher Is Better
AOCC 3.1 ... 28.30
Clang 12.0 . 27.78

Botan 2.17.3
Test: AES-256
MiB/s > Higher Is Better
AOCC 3.1 ... 4859.66
Clang 12.0 . 4929.67
GCC 11.1 ... 5849.13

Botan 2.17.3
Test: AES-256 - Decrypt
MiB/s > Higher Is Better
AOCC 3.1 ... 4876.96
Clang 12.0 . 4924.97
GCC 11.1 ... 5866.48

Botan 2.17.3
Test: Twofish
MiB/s > Higher Is Better
AOCC 3.1 ... 332.81
Clang 12.0 . 334.18
GCC 11.1 ... 331.41

Botan 2.17.3
Test: Twofish - Decrypt
MiB/s > Higher Is Better
AOCC 3.1 ... 339.97
Clang 12.0 . 339.90
GCC 11.1 ... 332.40

Botan 2.17.3
Test: Blowfish
MiB/s > Higher Is Better
AOCC 3.1 ... 417.05
Clang 12.0 . 398.71
GCC 11.1 ... 407.15

Botan 2.17.3
Test: Blowfish - Decrypt
MiB/s > Higher Is Better
AOCC 3.1 ... 405.31
Clang 12.0 . 404.37
GCC 11.1 ... 407.58

Botan 2.17.3
Test: CAST-256
MiB/s > Higher Is Better
AOCC 3.1 ... 134.37
Clang 12.0 . 140.29
GCC 11.1 ... 133.68

Botan 2.17.3
Test: CAST-256 - Decrypt
MiB/s > Higher Is Better
AOCC 3.1 ... 137.91
Clang 12.0 . 140.54
GCC 11.1 ... 133.62

LibRaw 0.20
Post-Processing Benchmark
Mpix/sec > Higher Is Better
AOCC 3.1 ... 45.11
Clang 12.0 . 43.68
GCC 11.1 ... 60.93

John The Ripper 1.9.0-jumbo-1
Test: Blowfish
Real C/S > Higher Is Better
AOCC 3.1 ... 61842
Clang 12.0 . 62745

John The Ripper 1.9.0-jumbo-1
Test: MD5
Real C/S > Higher Is Better
AOCC 3.1 ... 2240000
Clang 12.0 . 2043333

GraphicsMagick 1.3.33
Operation: Rotate
Iterations Per Minute > Higher Is Better
AOCC 3.1 ... 792
Clang 12.0 . 820
GCC 11.1 ... 848

GraphicsMagick 1.3.33
Operation: Enhanced
Iterations Per Minute > Higher Is Better
AOCC 3.1 ... 696
Clang 12.0 . 666
GCC 11.1 ... 658

SVT-AV1 0.8.7
Encoder Mode: Preset 4 - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 3.1 ... 1.987
Clang 12.0 . 1.988
GCC 11.1 ... 1.937

SVT-AV1 0.8.7
Encoder Mode: Preset 8 - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 3.1 ... 19.47
Clang 12.0 . 19.40
GCC 11.1 ... 18.87

SVT-HEVC 1.5.0
Tuning: 7 - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
AOCC 3.1 ... 303.75
Clang 12.0 . 295.67
GCC 11.1 ... 302.07

SVT-HEVC 1.5.0
Tuning: 10 - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
AOCC 3.1 ... 408.51
Clang 12.0 . 388.86
GCC 11.1 ... 529.50

SVT-VP9 0.3
Tuning: Visual Quality Optimized - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
AOCC 3.1 ... 234.18
Clang 12.0 . 240.65
GCC 11.1 ... 256.91

VP9 libvpx Encoding 1.10.0
Speed: Speed 5 - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 3.1 ... 15.62
Clang 12.0 . 17.12
GCC 11.1 ... 16.25

Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS > Higher Is Better
AOCC 3.1 ... 3819.99
Clang 12.0 . 4113.00
GCC 11.1 ... 4237.38

Stockfish 13
Total Time
Nodes Per Second > Higher Is Better
AOCC 3.1 . 90448030
GCC 11.1 . 91863789

libavif avifenc 0.9.0
Encoder Speed: 2
Seconds < Lower Is Better
AOCC 3.1 ... 24.77
Clang 12.0 . 24.40
GCC 11.1 ... 26.21

libavif avifenc 0.9.0
Encoder Speed: 6
Seconds < Lower Is Better
AOCC 3.1 ... 9.672
Clang 12.0 . 9.632
GCC 11.1 ... 10.376

libavif avifenc 0.9.0
Encoder Speed: 10
Seconds < Lower Is Better
AOCC 3.1 ... 3.497
Clang 12.0 . 3.517
GCC 11.1 ... 3.698

libavif avifenc 0.9.0
Encoder Speed: 6, Lossless
Seconds < Lower Is Better
AOCC 3.1 ... 26.67
Clang 12.0 . 26.10
GCC 11.1 ... 27.89

libavif avifenc 0.9.0
Encoder Speed: 10, Lossless
Seconds < Lower Is Better
AOCC 3.1 ... 5.800
Clang 12.0 . 5.966
GCC 11.1 ... 6.212

POV-Ray 3.7.0.7
Trace Time
Seconds < Lower Is Better
AOCC 3.1 ... 14.78
Clang 12.0 . 15.14
GCC 11.1 ... 15.36

oneDNN 2.1.2
Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU
ms < Lower Is Better
AOCC 3.1 ... 1.40164
Clang 12.0 . 1.40867
GCC 11.1 ... 1.42555

oneDNN 2.1.2
Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU
ms < Lower Is Better
AOCC 3.1 ... 5.07547
Clang 12.0 . 5.00842
GCC 11.1 ... 5.03445

oneDNN 2.1.2
Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
ms < Lower Is Better
AOCC 3.1 ... 1.40674
Clang 12.0 . 2.77765
GCC 11.1 ... 1.91827

oneDNN 2.1.2
Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
ms < Lower Is Better
AOCC 3.1 ... 1.80096
Clang 12.0 . 1.83657
GCC 11.1 ... 4.59273

oneDNN 2.1.2
Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU
ms < Lower Is Better
AOCC 3.1 ... 3.13186
Clang 12.0 . 3.10964
GCC 11.1 ... 2.95414

oneDNN 2.1.2
Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
ms < Lower Is Better
AOCC 3.1 ... 6432.49
Clang 12.0 . 6828.04
GCC 11.1 ... 6221.98

oneDNN 2.1.2
Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
ms < Lower Is Better
AOCC 3.1 ... 668.79
Clang 12.0 . 679.05
GCC 11.1 ... 702.08

oneDNN 2.1.2
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
ms < Lower Is Better
AOCC 3.1 ... 0.435999
Clang 12.0 . 0.436906
GCC 11.1 ... 0.457865

FLAC Audio Encoding 1.3.2
WAV To FLAC
Seconds < Lower Is Better
AOCC 3.1 ... 8.700
Clang 12.0 . 7.326
GCC 11.1 ... 8.161

LAME MP3 Encoding 3.100
WAV To MP3
Seconds < Lower Is Better
AOCC 3.1 ... 7.828
Clang 12.0 . 7.826
GCC 11.1 ... 7.099

Ngspice 34
Circuit: C2670
Seconds < Lower Is Better
AOCC 3.1 ... 91.52
Clang 12.0 . 92.73
GCC 11.1 ... 99.09

Ngspice 34
Circuit: C7552
Seconds < Lower Is Better
AOCC 3.1 ... 82.28
Clang 12.0 . 83.89
GCC 11.1 ... 84.14

RNNoise 2020-06-28
Seconds < Lower Is Better
AOCC 3.1 ... 18.73
Clang 12.0 . 18.56
GCC 11.1 ... 18.01

Tachyon 0.99b6
Total Time
Seconds < Lower Is Better
AOCC 3.1 ... 28.66
Clang 12.0 . 28.66
GCC 11.1 ... 27.64

Google SynthMark 20201109
Test: VoiceMark_100
Voices > Higher Is Better
AOCC 3.1 ... 621.63
Clang 12.0 . 621.43

SecureMark 1.0.4
Benchmark: SecureMark-TLS
marks > Higher Is Better
AOCC 3.1 ... 279931
Clang 12.0 . 285172
GCC 11.1 ... 257191

Liquid-DSP 2021.01.31
Threads: 16 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
AOCC 3.1 ... 806510000
Clang 12.0 . 886793333
GCC 11.1 ... 954493333

Liquid-DSP 2021.01.31
Threads: 32 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
AOCC 3.1 ... 1527933333
Clang 12.0 . 1651733333
GCC 11.1 ... 1719166667

Liquid-DSP 2021.01.31
Threads: 64 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
AOCC 3.1 ... 1894966667
Clang 12.0 . 2068366667
GCC 11.1 ... 1873666667

FinanceBench 2016-07-25
Benchmark: Repo OpenMP
ms < Lower Is Better
AOCC 3.1 ... 33763.58
Clang 12.0 . 33674.40
GCC 11.1 ... 35022.33

FinanceBench 2016-07-25
Benchmark: Bonds OpenMP
ms < Lower Is Better
AOCC 3.1 ... 51185.49
Clang 12.0 . 51304.84
GCC 11.1 ... 50564.02

libjpeg-turbo tjbench 2.1.0
Test: Decompression Throughput
Megapixels/sec > Higher Is Better
AOCC 3.1 ... 220.67
Clang 12.0 . 214.48
GCC 11.1 ... 210.80

ASTC Encoder 3.0
Preset: Exhaustive
Seconds < Lower Is Better
AOCC 3.1 . 21.89
GCC 11.1 . 23.26

SQLite Speedtest 3.30
Timed Time - Size 1,000
Seconds < Lower Is Better
AOCC 3.1 ... 55.24
Clang 12.0 . 54.80
GCC 11.1 ... 53.78

Google Draco 1.4.1
Model: Lion
ms < Lower Is Better
AOCC 3.1 ... 4960
Clang 12.0 . 5490

Google Draco 1.4.1
Model: Church Facade
ms < Lower Is Better
AOCC 3.1 ... 6488
Clang 12.0 . 7469

NCNN 20210525
Target: CPU-v2-v2 - Model: mobilenet-v2
ms < Lower Is Better
AOCC 3.1 ... 5.74
Clang 12.0 . 6.82
GCC 11.1 ... 7.71

NCNN 20210525
Target: CPU-v3-v3 - Model: mobilenet-v3
ms < Lower Is Better
AOCC 3.1 ... 5.60
Clang 12.0 . 6.77
GCC 11.1 ... 7.09

NCNN 20210525
Target: CPU - Model: vgg16
ms < Lower Is Better
AOCC 3.1 ... 51.60
Clang 12.0 . 59.74
GCC 11.1 ... 56.42

NCNN 20210525
Target: CPU - Model: resnet18
ms < Lower Is Better
AOCC 3.1 ... 11.88
Clang 12.0 . 13.15
GCC 11.1 ... 11.19

TNN 0.3
Target: CPU - Model: SqueezeNet v2
ms < Lower Is Better
AOCC 3.1 ... 60.66
Clang 12.0 . 61.22
GCC 11.1 ... 64.79

TNN 0.3
Target: CPU - Model: SqueezeNet v1.1
ms < Lower Is Better
AOCC 3.1 ... 266.24
Clang 12.0 . 265.12
GCC 11.1 ... 270.54

Facebook RocksDB 6.22.1
Test: Update Random
Op/s > Higher Is Better
AOCC 3.1 ... 832651
Clang 12.0 . 812665
GCC 11.1 ... 832118

Facebook RocksDB 6.22.1
Test: Read While Writing
Op/s > Higher Is Better
AOCC 3.1 ... 6135560
Clang 12.0 . 5822258
GCC 11.1 ... 6369004

Facebook RocksDB 6.22.1
Test: Read Random Write Random
Op/s > Higher Is Better
AOCC 3.1 ... 3059068
Clang 12.0 . 2875994
GCC 11.1 ... 3139993

ONNX Runtime 1.6
Model: bertsquad-10 - Device: OpenMP CPU
Inferences Per Minute > Higher Is Better
AOCC 3.1 ... 487
Clang 12.0 . 480
GCC 11.1 ... 432

ONNX Runtime 1.6
Model: fcn-resnet101-11 - Device: OpenMP CPU
Inferences Per Minute > Higher Is Better
AOCC 3.1 ... 90
Clang 12.0 . 85
GCC 11.1 ... 89

ONNX Runtime 1.6
Model: super-resolution-10 - Device: OpenMP CPU
Inferences Per Minute > Higher Is Better
AOCC 3.1 ... 4568
Clang 12.0 . 4630
GCC 11.1 ... 4922

WavPack Audio Encoding 5.3
WAV To WavPack
Seconds < Lower Is Better
AOCC 3.1 ... 13.18
Clang 12.0 . 13.17
GCC 11.1 ... 13.13

GnuPG 2.2.27
2.7GB Sample File Encryption
Seconds < Lower Is Better
AOCC 3.1 ... 67.61
Clang 12.0 . 67.90
GCC 11.1 ... 67.49

Geometric Mean Of All Test Results
Result Composite - AMD AOCC 3.1 Compiler Comparison
Geometric Mean > Higher Is Better
AOCC 3.1 ... 154.72
Clang 12.0 . 152.20
GCC 11.1 ... 150.45
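
The composite above is a geometric mean across tests that mix "Higher Is Better" and "Lower Is Better" metrics. The results file does not say how the individual results are normalized before averaging, so the following is only a minimal sketch of one plausible approach, not the method used to produce the 154.72 / 152.20 / 150.45 figures. It assumes each result is divided by a baseline compiler's result (GCC 11.1 here, chosen arbitrarily) and that lower-is-better ratios are inverted. The names `geometric_mean`, `composite`, and `SAMPLE` are hypothetical, and only three of the tests listed above are included.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed in log space."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty list of positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# (test name, higher_is_better, results per compiler); values copied from the results above.
SAMPLE = [
    ("C-Blosc 2.0 blosclz (MB/s)", True,
     {"AOCC 3.1": 25069.6, "Clang 12.0": 25417.8, "GCC 11.1": 25536.1}),
    ("QuantLib 1.21 (MFLOPS)", True,
     {"AOCC 3.1": 2854.9, "Clang 12.0": 2838.6, "GCC 11.1": 2861.2}),
    ("POV-Ray 3.7.0.7 (seconds)", False,
     {"AOCC 3.1": 14.78, "Clang 12.0": 15.14, "GCC 11.1": 15.36}),
]


def composite(sample, compiler, baseline="GCC 11.1"):
    """Geometric mean of per-test ratios against the baseline compiler.

    Assumption: lower-is-better ratios are inverted so that a value above 1.0
    always means "faster than the baseline".
    """
    ratios = []
    for _name, higher_is_better, results in sample:
        ratio = results[compiler] / results[baseline]
        ratios.append(ratio if higher_is_better else 1.0 / ratio)
    return geometric_mean(ratios)


if __name__ == "__main__":
    for compiler in ("AOCC 3.1", "Clang 12.0", "GCC 11.1"):
        print(f"{compiler}: {composite(SAMPLE, compiler):.4f}")
```

Under these assumptions the baseline compiler always scores exactly 1.0, and the scores for the other compilers depend on which tests are included, so the three-test subset here is not expected to match the ordering or magnitude of the full composite.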