AmpereOne GCC Clang Compiler Benchmarking

AmpereOne compiler testing by Michael Larabel for a future article.

GCC 13.2 - Default:

  Processor: AmpereOne @ 3.20GHz (192 Cores), Motherboard: Supermicro ARS-211M-NR R13SPD v1.02 (T20240726102529 BIOS), Chipset: Ampere Computing LLC Device e208, Memory: 8 x 64GB DDR5-5200MT/s, Disk: 3841GB SAMSUNG MZQL23T8HCLS-00A07 + 960GB SAMSUNG MZ1L2960HCJR-00A07, Graphics: ASPEED, Monitor: VGA HDMI, Network: 2 x Broadcom BCM57414 NetXtreme-E 10Gb/25Gb + 2 x Mellanox MT2892

  OS: Ubuntu 24.04, Kernel: 6.11.0-061100rc6daily20240904-generic-64k (aarch64), Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1920x1080

Clang 18.1.3:

  Processor: AmpereOne @ 3.20GHz (192 Cores), Motherboard: Supermicro ARS-211M-NR R13SPD v1.02 (T20240726102529 BIOS), Chipset: Ampere Computing LLC Device e208, Memory: 8 x 64GB DDR5-5200MT/s, Disk: 3841GB SAMSUNG MZQL23T8HCLS-00A07 + 960GB SAMSUNG MZ1L2960HCJR-00A07, Graphics: ASPEED, Monitor: VGA HDMI, Network: 2 x Broadcom BCM57414 NetXtreme-E 10Gb/25Gb + 2 x Mellanox MT2892

  OS: Ubuntu 24.04, Kernel: 6.11.0-061100rc6daily20240904-generic-64k (aarch64), Compiler: Clang 18.1.3, File-System: ext4, Screen Resolution: 1920x1080

Clang 19.1.0:

  Processor: AmpereOne @ 3.20GHz (192 Cores), Motherboard: Supermicro ARS-211M-NR R13SPD v1.02 (T20240726102529 BIOS), Chipset: Ampere Computing LLC Device e208, Memory: 8 x 64GB DDR5-5200MT/s, Disk: 3841GB SAMSUNG MZQL23T8HCLS-00A07 + 960GB SAMSUNG MZ1L2960HCJR-00A07, Graphics: ASPEED, Monitor: VGA HDMI, Network: 2 x Broadcom BCM57414 NetXtreme-E 10Gb/25Gb + 2 x Mellanox MT2892

  OS: Ubuntu 24.04, Kernel: 6.11.0-061100rc6daily20240904-generic-64k (aarch64), Compiler: Clang 19.1.0, File-System: ext4, Screen Resolution: 1920x1080

Clang 20.0 Git:

  Processor: AmpereOne @ 3.20GHz (192 Cores), Motherboard: Supermicro ARS-211M-NR R13SPD v1.02 (T20240726102529 BIOS), Chipset: Ampere Computing LLC Device e208, Memory: 8 x 64GB DDR5-5200MT/s, Disk: 3841GB SAMSUNG MZQL23T8HCLS-00A07 + 960GB SAMSUNG MZ1L2960HCJR-00A07, Graphics: ASPEED, Monitor: VGA HDMI, Network: 2 x Broadcom BCM57414 NetXtreme-E 10Gb/25Gb + 2 x Mellanox MT2892

  OS: Ubuntu 24.04, Kernel: 6.11.0-061100rc6daily20240904-generic-64k (aarch64), Compiler: Clang 20.0.0, File-System: ext4, Screen Resolution: 1920x1080

TSCP 1.81
AI Chess Performance
Nodes Per Second > Higher Is Better
GCC 13.2 - Default . 1387350 |===========================================
Clang 18.1.3 ....... 1549749 |================================================
Clang 19.1.0 ....... 1578160 |=================================================
Clang 20.0 Git ..... 1579974 |=================================================

FLAC Audio Encoding 1.4
WAV To FLAC
Seconds < Lower Is Better
GCC 13.2 - Default . 24.00 |===================================================
Clang 18.1.3 ....... 23.05 |=================================================
Clang 19.1.0 ....... 22.80 |================================================
Clang 20.0 Git ..... 22.82 |================================================

Opus Codec Encoding 1.4
WAV To Opus Encode
Seconds < Lower Is Better
GCC 13.2 - Default . 72.95 |===================================================
Clang 18.1.3 ....... 68.88 |================================================
Clang 19.1.0 ....... 68.14 |================================================
Clang 20.0 Git ..... 68.57 |================================================

Etcpak 2.0
Benchmark: Multi-Threaded - Configuration: ETC2
Mpx/s > Higher Is Better
GCC 13.2 - Default . 4303.51 |================================================
Clang 18.1.3 ....... 4339.78 |================================================
Clang 19.1.0 ....... 4381.34 |=================================================
Clang 20.0 Git ..... 4385.16 |=================================================

WebP Image Encode 1.4
Encode Settings: Default
MP/s > Higher Is Better
GCC 13.2 - Default . 9.26 |==========================================
Clang 18.1.3 ....... 11.00 |==================================================
Clang 19.1.0 ....... 11.20 |===================================================
Clang 20.0 Git ..... 11.28 |===================================================

WebP Image Encode 1.4
Encode Settings: Quality 100
MP/s > Higher Is Better
GCC 13.2 - Default . 6.60 |=============================================
Clang 18.1.3 ....... 7.43 |===================================================
Clang 19.1.0 ....... 7.56 |====================================================
Clang 20.0 Git ..... 7.56 |====================================================

WebP Image Encode 1.4
Encode Settings: Quality 100, Lossless
MP/s > Higher Is Better
GCC 13.2 - Default . 1.12 |=================================================
Clang 18.1.3 ....... 1.19 |====================================================
Clang 19.1.0 ....... 1.18 |====================================================
Clang 20.0 Git ..... 1.18 |====================================================

WebP Image Encode 1.4
Encode Settings: Quality 100, Highest Compression
MP/s > Higher Is Better
GCC 13.2 - Default . 2.80 |==========================================
Clang 18.1.3 ....... 3.33 |==================================================
Clang 19.1.0 ....... 3.43 |====================================================
Clang 20.0 Git ..... 3.44 |====================================================

WebP Image Encode 1.4
Encode Settings: Quality 100, Lossless, Highest Compression
MP/s > Higher Is Better
GCC 13.2 - Default . 0.39 |============================================
Clang 18.1.3 ....... 0.46 |====================================================
Clang 19.1.0 ....... 0.46 |====================================================
Clang 20.0 Git ..... 0.46 |====================================================

Gcrypt Library 1.10.3
Seconds < Lower Is Better
GCC 13.2 - Default . 324.22 |==================================================
Clang 18.1.3 ....... 310.17 |================================================
Clang 19.1.0 ....... 310.96 |================================================
Clang 20.0 Git ..... 310.33 |================================================

SecureMark 1.0.4
Benchmark: SecureMark-TLS
marks > Higher Is Better
GCC 13.2 - Default . 171949 |=================================================
Clang 18.1.3 ....... 172428 |==================================================
Clang 19.1.0 ....... 174043 |==================================================
Clang 20.0 Git ..... 173477 |==================================================

QuantLib 1.32
Configuration: Multi-Threaded
MFLOPS > Higher Is Better
GCC 13.2 - Default . 300689.6 |==============================================
Clang 18.1.3 ....... 313766.7 |================================================
Clang 19.1.0 ....... 314045.6 |================================================
Clang 20.0 Git ..... 314473.7 |================================================

miniBUDE 20210901
Implementation: OpenMP - Input Deck: BM1
GFInst/s > Higher Is Better
GCC 13.2 - Default . 746.97 |==================================================
Clang 18.1.3 ....... 690.39 |==============================================
Clang 19.1.0 ....... 684.55 |==============================================
Clang 20.0 Git ..... 684.36 |==============================================

miniBUDE 20210901
Implementation: OpenMP - Input Deck: BM1
Billion Interactions/s > Higher Is Better
GCC 13.2 - Default . 29.88 |===================================================
Clang 18.1.3 ....... 27.62 |===============================================
Clang 19.1.0 ....... 27.38 |===============================================
Clang 20.0 Git ..... 27.38 |===============================================

miniBUDE 20210901
Implementation: OpenMP - Input Deck: BM2
GFInst/s > Higher Is Better
GCC 13.2 - Default . 783.60 |==================================================
Clang 18.1.3 ....... 719.31 |==============================================
Clang 19.1.0 ....... 719.62 |==============================================
Clang 20.0 Git ..... 719.80 |==============================================

miniBUDE 20210901
Implementation: OpenMP - Input Deck: BM2
Billion Interactions/s > Higher Is Better
GCC 13.2 - Default . 31.34 |===================================================
Clang 18.1.3 ....... 28.77 |===============================================
Clang 19.1.0 ....... 28.79 |===============================================
Clang 20.0 Git ..... 28.79 |===============================================

GROMACS 2024
Implementation: MPI CPU - Input: water_GMX50_bare
Ns Per Day > Higher Is Better
GCC 13.2 - Default . 7.497 |===================================================
Clang 18.1.3 ....... 6.888 |===============================================
Clang 19.1.0 ....... 6.894 |===============================================
Clang 20.0 Git ..... 6.884 |===============================================

LAMMPS Molecular Dynamics Simulator 23Jun2022
Model: 20k Atoms
ns/day > Higher Is Better
GCC 13.2 - Default . 55.02 |===================================================
Clang 18.1.3 ....... 55.33 |===================================================
Clang 19.1.0 ....... 55.05 |===================================================
Clang 20.0 Git ..... 55.22 |===================================================

LAMMPS Molecular Dynamics Simulator 23Jun2022
Model: Rhodopsin Protein
ns/day > Higher Is Better
GCC 13.2 - Default . 47.62 |=============================================
Clang 18.1.3 ....... 54.53 |===================================================
Clang 19.1.0 ....... 54.41 |===================================================
Clang 20.0 Git ..... 47.41 |============================================

Primesieve 12.1
Length: 1e13
Seconds < Lower Is Better
GCC 13.2 - Default . 14.01 |===================================================
Clang 18.1.3 ....... 14.04 |===================================================
Clang 19.1.0 ....... 14.01 |===================================================
Clang 20.0 Git ..... 14.00 |===================================================

7-Zip Compression 24.05
Test: Compression Rating
MIPS > Higher Is Better
GCC 13.2 - Default . 786951 |==================================================
Clang 18.1.3 ....... 755498 |================================================
Clang 19.1.0 ....... 748668 |================================================
Clang 20.0 Git ..... 750631 |================================================

7-Zip Compression 24.05
Test: Decompression Rating
MIPS > Higher Is Better
GCC 13.2 - Default . 891624 |=============================================
Clang 18.1.3 ....... 975049 |=================================================
Clang 19.1.0 ....... 986522 |==================================================
Clang 20.0 Git ..... 978715 |==================================================

GraphicsMagick 1.3.43
Operation: Swirl
Iterations Per Minute > Higher Is Better
GCC 13.2 - Default . 990 |===================================================
Clang 18.1.3 ....... 995 |====================================================
Clang 19.1.0 ....... 1001 |====================================================
Clang 20.0 Git ..... 989 |===================================================

GraphicsMagick 1.3.43
Operation: Rotate
Iterations Per Minute > Higher Is Better
GCC 13.2 - Default . 258 |===================================================
Clang 18.1.3 ....... 268 |=====================================================
Clang 19.1.0 ....... 267 |=====================================================
Clang 20.0 Git ..... 256 |===================================================

GraphicsMagick 1.3.43
Operation: Sharpen
Iterations Per Minute > Higher Is Better
GCC 13.2 - Default . 466 |====================================================
Clang 18.1.3 ....... 475 |=====================================================
Clang 19.1.0 ....... 473 |====================================================
Clang 20.0 Git ..... 478 |=====================================================

GraphicsMagick 1.3.43
Operation: Enhanced
Iterations Per Minute > Higher Is Better
GCC 13.2 - Default . 373 |=====================================================
Clang 18.1.3 ....... 368 |====================================================
Clang 19.1.0 ....... 353 |==================================================
Clang 20.0 Git ..... 356 |===================================================

GraphicsMagick 1.3.43
Operation: Resizing
Iterations Per Minute > Higher Is Better
GCC 13.2 - Default . 219 |=====================================================
Clang 18.1.3 ....... 215 |====================================================
Clang 19.1.0 ....... 217 |=====================================================
Clang 20.0 Git ..... 216 |====================================================

GraphicsMagick 1.3.43
Operation: Noise-Gaussian
Iterations Per Minute > Higher Is Better
GCC 13.2 - Default . 259 |=====================================================
Clang 18.1.3 ....... 247 |===================================================
Clang 19.1.0 ....... 245 |==================================================
Clang 20.0 Git ..... 245 |==================================================

GraphicsMagick 1.3.43
Operation: HWB Color Space
Iterations Per Minute > Higher Is Better
GCC 13.2 - Default . 429 |=====================================================
Clang 18.1.3 ....... 397 |=================================================
Clang 19.1.0 ....... 402 |==================================================
Clang 20.0 Git ..... 404 |==================================================

C-Ray 2.0
Resolution: 4K - Rays Per Pixel: 16
Seconds < Lower Is Better
GCC 13.2 - Default . 21.69 |===================================================
Clang 18.1.3 ....... 18.64 |============================================
Clang 19.1.0 ....... 18.77 |============================================
Clang 20.0 Git ..... 20.19 |===============================================

C-Ray 2.0
Resolution: 5K - Rays Per Pixel: 16
Seconds < Lower Is Better
GCC 13.2 - Default . 38.18 |===================================================
Clang 18.1.3 ....... 32.80 |============================================
Clang 19.1.0 ....... 33.06 |============================================
Clang 20.0 Git ..... 35.56 |================================================

POV-Ray 3.7.0.7
Trace Time
Seconds < Lower Is Better
GCC 13.2 - Default . 7.494 |==============================================
Clang 18.1.3 ....... 8.255 |===================================================
Clang 19.1.0 ....... 7.973 |=================================================
Clang 20.0 Git ..... 7.971 |=================================================

libavif avifenc 1.0
Encoder Speed: 0
Seconds < Lower Is Better
GCC 13.2 - Default . 185.86 |===================================
Clang 18.1.3 ....... 264.17 |==================================================
Clang 19.1.0 ....... 235.76 |=============================================
Clang 20.0 Git ..... 235.73 |=============================================

libavif avifenc 1.0
Encoder Speed: 2
Seconds < Lower Is Better
GCC 13.2 - Default . 115.60 |=============================
Clang 18.1.3 ....... 197.15 |==================================================
Clang 19.1.0 ....... 170.82 |===========================================
Clang 20.0 Git ..... 171.38 |===========================================

libavif avifenc 1.0
Encoder Speed: 6
Seconds < Lower Is Better
GCC 13.2 - Default . 2.908 |==================================================
Clang 18.1.3 ....... 2.992 |===================================================
Clang 19.1.0 ....... 2.949 |==================================================
Clang 20.0 Git ..... 2.954 |==================================================

libavif avifenc 1.0
Encoder Speed: 6, Lossless
Seconds < Lower Is Better
GCC 13.2 - Default . 5.687 |===================================================
Clang 18.1.3 ....... 5.730 |===================================================
Clang 19.1.0 ....... 5.676 |===================================================
Clang 20.0 Git ..... 5.685 |===================================================

libavif avifenc 1.0
Encoder Speed: 10, Lossless
Seconds < Lower Is Better
GCC 13.2 - Default . 4.411 |===================================================
Clang 18.1.3 ....... 4.411 |===================================================
Clang 19.1.0 ....... 4.385 |===================================================
Clang 20.0 Git ..... 4.369 |===================================================

Liquid-DSP 1.6
Threads: 128 - Buffer Length: 256 - Filter Length: 32
samples/s > Higher Is Better
GCC 13.2 - Default . 2197800000 |=============================
Clang 18.1.3 ....... 3500833333 |==============================================
Clang 19.1.0 ....... 3499066667 |==============================================
Clang 20.0 Git ..... 3499300000 |==============================================

Liquid-DSP 1.6
Threads: 192 - Buffer Length: 256 - Filter Length: 32
samples/s > Higher Is Better
GCC 13.2 - Default . 3148866667 |============================
Clang 18.1.3 ....... 5249800000 |==============================================
Clang 19.1.0 ....... 5246666667 |==============================================
Clang 20.0 Git ..... 5247633333 |==============================================

Liquid-DSP 1.6
Threads: 128 - Buffer Length: 256 - Filter Length: 512
samples/s > Higher Is Better
GCC 13.2 - Default . 152760000 |============================================
Clang 18.1.3 ....... 164420000 |===============================================
Clang 19.1.0 ....... 164456667 |===============================================
Clang 20.0 Git ..... 164463333 |===============================================

Liquid-DSP 1.6
Threads: 192 - Buffer Length: 256 - Filter Length: 512
samples/s > Higher Is Better
GCC 13.2 - Default . 224140000 |===========================================
Clang 18.1.3 ....... 246590000 |===============================================
Clang 19.1.0 ....... 246603333 |===============================================
Clang 20.0 Git ..... 246653333 |===============================================

Helsing 1.0-beta
Digit Range: 14 digit
Seconds < Lower Is Better
GCC 13.2 - Default . 33.57 |==================================================
Clang 18.1.3 ....... 34.37 |===================================================
Clang 19.1.0 ....... 34.45 |===================================================
Clang 20.0 Git ..... 34.46 |===================================================

simdjson 3.10
Throughput Test: Kostya
GB/s > Higher Is Better
GCC 13.2 - Default . 1.18 |====================================================
Clang 18.1.3 ....... 1.15 |===================================================

simdjson 3.10
Throughput Test: TopTweet
GB/s > Higher Is Better
GCC 13.2 - Default . 2.21 |====================================================
Clang 18.1.3 ....... 2.18 |===================================================

simdjson 3.10
Throughput Test: LargeRandom
GB/s > Higher Is Better
GCC 13.2 - Default . 0.65 |====================================================
Clang 18.1.3 ....... 0.64 |===================================================

simdjson 3.10
Throughput Test: PartialTweets
GB/s > Higher Is Better
GCC 13.2 - Default . 2.15 |====================================================
Clang 18.1.3 ....... 2.12 |===================================================

simdjson 3.10
Throughput Test: DistinctUserID
GB/s > Higher Is Better
GCC 13.2 - Default . 2.22 |====================================================
Clang 18.1.3 ....... 2.20 |====================================================