AMD AOCC 2.2 vs. GCC 10 vs. Clang 10 - EPYC 7742 2P

AMD AOCC 2.2 compiler against GCC 10, LLVM Clang 10. Benchmarks by Michael Larabel for a future article.

AOCC 2.2:

Processor: 2 x AMD EPYC 7742 64-Core @ 2.25GHz (128 Cores / 256 Threads), Motherboard: AMD DAYTONA_X (RDY1006G BIOS), Chipset: AMD Starship/Matisse, Memory: 504GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Monitor: VE228, Network: 2 x Mellanox MT27710

OS: Ubuntu 20.10, Kernel: 5.4.0-42-generic (x86_64), Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Display Driver: modesetting 1.20.8, Compiler: Clang 10.0.0, File-System: ext4, Screen Resolution: 1920x1080

GCC 10.2:

Processor: 2 x AMD EPYC 7742 64-Core @ 2.25GHz (128 Cores / 256 Threads), Motherboard: AMD DAYTONA_X (RDY1006G BIOS), Chipset: AMD Starship/Matisse, Memory: 504GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Monitor: VE228, Network: 2 x Mellanox MT27710

OS: Ubuntu 20.10, Kernel: 5.4.0-42-generic (x86_64), Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Display Driver: modesetting 1.20.8, Compiler: GCC 10.2.0, File-System: ext4, Screen Resolution: 1920x1080

Clang 10.0.1:

Processor: 2 x AMD EPYC 7742 64-Core @ 2.25GHz (128 Cores / 256 Threads), Motherboard: AMD DAYTONA_X (RDY1006G BIOS), Chipset: AMD Starship/Matisse, Memory: 504GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Monitor: VE228, Network: 2 x Mellanox MT27710

OS: Ubuntu 20.10, Kernel: 5.4.0-42-generic (x86_64), Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Display Driver: modesetting 1.20.8, Compiler: Clang 10.0.1-1, File-System: ext4, Screen Resolution: 1920x1080

Crypto++ 8.2
Test: Unkeyed Algorithms
MiB/second > Higher Is Better
AOCC 2.2 ..... 317.38 |========================================================
GCC 10.2 ..... 310.14 |=======================================================
Clang 10.0.1 . 316.54 |========================================================

Timed MrBayes Analysis 3.2.7
Primate Phylogeny Analysis
Seconds < Lower Is Better
AOCC 2.2 ..... 106.58 |======================================================
GCC 10.2 ..... 109.59 |=======================================================
Clang 10.0.1 . 111.35 |========================================================

Zstd Compression 1.4.5
Compression Level: 19
MB/s > Higher Is Better
AOCC 2.2 ..... 128.3 |=========================================================
GCC 10.2 ..... 125.7 |========================================================
Clang 10.0.1 . 120.7 |======================================================

SciMark 2.0
Computational Test: Monte Carlo
Mflops > Higher Is Better
AOCC 2.2 ..... 620.89 |========================================================
GCC 10.2 ..... 611.72 |=======================================================
Clang 10.0.1 . 620.64 |========================================================

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops > Higher Is Better
AOCC 2.2 ..... 3450.97 |=======================================================
GCC 10.2 ..... 2811.30 |=============================================
Clang 10.0.1 . 3462.03 |=======================================================

TSCP 1.81
AI Chess Performance
Nodes Per Second > Higher Is Better
AOCC 2.2 ..... 1163950 |======================================================
GCC 10.2 ..... 1014328 |===============================================
Clang 10.0.1 . 1178898 |=======================================================

John The Ripper 1.9.0-jumbo-1
Test: Blowfish
Real C/S > Higher Is Better
AOCC 2.2 ..... 176970 |========================================================
GCC 10.2 ..... 149154 |===============================================
Clang 10.0.1 . 177644 |========================================================

GraphicsMagick 1.3.33
Operation: Rotate
Iterations Per Minute > Higher Is Better
AOCC 2.2 ..... 529 |===========================================================
GCC 10.2 ..... 497 |=======================================================
Clang 10.0.1 . 527 |===========================================================

GraphicsMagick 1.3.33
Operation: Enhanced
Iterations Per Minute > Higher Is Better
AOCC 2.2 ..... 1108 |=============================================
GCC 10.2 ..... 1439 |==========================================================
Clang 10.0.1 . 1327 |=====================================================

GraphicsMagick 1.3.33
Operation: Resizing
Iterations Per Minute > Higher Is Better
AOCC 2.2 ..... 351 |===========================================================
GCC 10.2 ..... 107 |==================
Clang 10.0.1 . 162 |===========================

oneDNN 1.5
Harness: IP Batch 1D - Data Type: f32 - Engine: CPU
ms < Lower Is Better
AOCC 2.2 ..... 0.812279 |======================
GCC 10.2 ..... 1.983800 |======================================================
Clang 10.0.1 . 0.925616 |=========================

oneDNN 1.5
Harness: IP Batch All - Data Type: f32 - Engine: CPU
ms < Lower Is Better
AOCC 2.2 ..... 12.68 |======================================
GCC 10.2 ..... 18.87 |=========================================================
Clang 10.0.1 . 12.91 |=======================================

oneDNN 1.5
Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
ms < Lower Is Better
AOCC 2.2 ..... 0.506847 |=====================================
GCC 10.2 ..... 0.744088 |======================================================
Clang 10.0.1 . 0.537037 |=======================================

oneDNN 1.5
Harness: Deconvolution Batch deconv_1d - Data Type: f32 - Engine: CPU
ms < Lower Is Better
AOCC 2.2 ..... 0.989625 |===================
GCC 10.2 ..... 2.826880 |======================================================
Clang 10.0.1 . 1.057690 |====================

oneDNN 1.5
Harness: Deconvolution Batch deconv_3d - Data Type: f32 - Engine: CPU
ms < Lower Is Better
AOCC 2.2 ..... 2.57847 |================================================
GCC 10.2 ..... 2.98436 |=======================================================
Clang 10.0.1 . 2.62134 |================================================

oneDNN 1.5
Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
ms < Lower Is Better
AOCC 2.2 ..... 162.42 |==========
GCC 10.2 ..... 870.72 |========================================================
Clang 10.0.1 . 219.27 |==============

oneDNN 1.5
Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
ms < Lower Is Better
AOCC 2.2 ..... 61.87 |==========
GCC 10.2 ..... 351.23 |========================================================
Clang 10.0.1 . 87.60 |==============

oneDNN 1.5
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
ms < Lower Is Better
AOCC 2.2 ..... 0.228572 |=================
GCC 10.2 ..... 0.737722 |======================================================
Clang 10.0.1 . 0.272380 |====================

SVT-VP9 0.1
Tuning: VMAF Optimized - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
AOCC 2.2 ..... 392.64 |========================================================
GCC 10.2 ..... 376.61 |======================================================
Clang 10.0.1 . 366.54 |====================================================

SVT-VP9 0.1
Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
AOCC 2.2 ..... 412.51 |========================================================
GCC 10.2 ..... 382.50 |====================================================
Clang 10.0.1 . 388.72 |=====================================================

SVT-VP9 0.1
Tuning: Visual Quality Optimized - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
AOCC 2.2 ..... 336.19 |========================================================
GCC 10.2 ..... 311.85 |====================================================
Clang 10.0.1 . 321.84 |======================================================

x264 2019-12-17
H.264 Video Encoding
Frames Per Second > Higher Is Better
AOCC 2.2 ..... 204.60 |========================================================
GCC 10.2 ..... 204.52 |========================================================
Clang 10.0.1 . 195.51 |======================================================

Timed Apache Compilation 2.4.41
Time To Compile
Seconds < Lower Is Better
AOCC 2.2 ..... 41.25 |=========================================================
GCC 10.2 ..... 23.21 |================================
Clang 10.0.1 . 21.74 |==============================

Timed FFmpeg Compilation 4.2.2
Time To Compile
Seconds < Lower Is Better
AOCC 2.2 ..... 39.87 |=========================================================
GCC 10.2 ..... 16.44 |========================
Clang 10.0.1 . 21.82 |===============================

Timed MPlayer Compilation 1.4
Time To Compile
Seconds < Lower Is Better
AOCC 2.2 ..... 35.88 |=========================================================
GCC 10.2 ..... 10.70 |=================
Clang 10.0.1 . 18.09 |=============================

Bullet Physics Engine 2.81
Test: 1000 Convex
Seconds < Lower Is Better
AOCC 2.2 ..... 4.557857 |===================================================
GCC 10.2 ..... 4.873698 |======================================================
Clang 10.0.1 . 4.558771 |===================================================

OpenSSL 1.1.1
RSA 4096-bit Performance
Signs Per Second > Higher Is Better
AOCC 2.2 ..... 18574.8 |==========================================
GCC 10.2 ..... 24437.6 |=======================================================
Clang 10.0.1 . 18618.0 |==========================================

LevelDB 1.22
Benchmark: Hot Read
Microseconds Per Op < Lower Is Better
AOCC 2.2 ..... 286.17 |=======================================================
GCC 10.2 ..... 289.70 |=======================================================
Clang 10.0.1 . 293.72 |========================================================

ASTC Encoder 2.0
Preset: Fast
Seconds < Lower Is Better
AOCC 2.2 ..... 4.99 |====================================================
GCC 10.2 ..... 5.61 |==========================================================
Clang 10.0.1 . 5.17 |=====================================================

ASTC Encoder 2.0
Preset: Medium
Seconds < Lower Is Better
AOCC 2.2 ..... 5.67 |=====================================================
GCC 10.2 ..... 6.25 |==========================================================
Clang 10.0.1 . 5.73 |=====================================================

ASTC Encoder 2.0
Preset: Thorough
Seconds < Lower Is Better
AOCC 2.2 ..... 8.04 |===================================================
GCC 10.2 ..... 9.09 |==========================================================
Clang 10.0.1 . 8.11 |====================================================

ASTC Encoder 2.0
Preset: Exhaustive
Seconds < Lower Is Better
AOCC 2.2 ..... 19.95 |====================================================
GCC 10.2 ..... 22.00 |=========================================================
Clang 10.0.1 . 19.71 |===================================================

CppPerformanceBenchmarks 9
Test: Ctype
Seconds < Lower Is Better
AOCC 2.2 ..... 40.21 |=====================================================
GCC 10.2 ..... 43.27 |=========================================================
Clang 10.0.1 . 40.34 |=====================================================

CppPerformanceBenchmarks 9
Test: Math Library
Seconds < Lower Is Better
AOCC 2.2 ..... 334.75 |=======================================================
GCC 10.2 ..... 339.96 |========================================================
Clang 10.0.1 . 331.62 |=======================================================

CppPerformanceBenchmarks 9
Test: Stepanov Abstraction
Seconds < Lower Is Better
AOCC 2.2 ..... 34.01 |====================================================
GCC 10.2 ..... 36.97 |=========================================================
Clang 10.0.1 . 33.55 |====================================================

Apache Benchmark 2.4.29
Static Web Page Serving
Requests Per Second > Higher Is Better
AOCC 2.2 ..... 27488.36 |======================================================
GCC 10.2 ..... 26062.36 |===================================================
Clang 10.0.1 . 26903.49 |=====================================================
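
Because the results mix "Higher Is Better" and "Lower Is Better" metrics, the raw numbers cannot be compared across tests directly. The following is a minimal Python sketch of one way to normalize them, assuming GCC 10.2 as the baseline and a geometric mean as the summary statistic. Both choices, the four-test subset, and the helper names `relative_to` and `geometric_mean` are illustrative assumptions, not part of the original result file or the author's method.

```python
from math import prod

# (test name, higher_is_better, {compiler: result}); values copied from four
# of the results listed above. The subset is arbitrary and only illustrative.
RESULTS = [
    ("Zstd 1.4.5, level 19 (MB/s)", True,
     {"AOCC 2.2": 128.3, "GCC 10.2": 125.7, "Clang 10.0.1": 120.7}),
    ("oneDNN 1.5, RNN Training (ms)", False,
     {"AOCC 2.2": 162.42, "GCC 10.2": 870.72, "Clang 10.0.1": 219.27}),
    ("OpenSSL 1.1.1, RSA 4096 (signs/s)", True,
     {"AOCC 2.2": 18574.8, "GCC 10.2": 24437.6, "Clang 10.0.1": 18618.0}),
    ("Timed MPlayer Compilation 1.4 (s)", False,
     {"AOCC 2.2": 35.88, "GCC 10.2": 10.70, "Clang 10.0.1": 18.09}),
]


def relative_to(baseline, results):
    """Return {compiler: [ratio per test]}, where a ratio above 1.0 means
    the compiler did better than the baseline on that test."""
    ratios = {}
    for _name, higher_is_better, values in results:
        base = values[baseline]
        for compiler, value in values.items():
            # Invert the ratio for lower-is-better metrics so that
            # "bigger is better" holds for every entry.
            ratio = value / base if higher_is_better else base / value
            ratios.setdefault(compiler, []).append(ratio)
    return ratios


def geometric_mean(xs):
    """Geometric mean of a non-empty list of positive numbers."""
    return prod(xs) ** (1.0 / len(xs))


if __name__ == "__main__":
    for compiler, ratios in relative_to("GCC 10.2", RESULTS).items():
        print(f"{compiler:13s} geomean vs. GCC 10.2: {geometric_mean(ratios):.2f}x")
```

A summary over only four tests is sensitive to outliers such as the oneDNN RNN result, so a fair overall comparison would need the full set of results entered the same way.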