apple m2 compilers

Apple M2 compiler benchmarks for a future article by Michael Larabel.

Clang:

  Processor: Apple M2 @ 2.42GHz (4 Cores / 8 Threads), Motherboard: Apple MacBook Air (13 h M2 2022), Memory: 8GB, Disk: 251GB APPLE SSD AP0256Z + 2 x 0GB APPLE SSD AP0256Z, Graphics: llvmpipe, Network: Broadcom Device 4433 + Broadcom Device 5f71

  OS: Arch rolling, Kernel: 5.19.0-rc7-asahi-2-1-ARCH (aarch64), Desktop: KDE Plasma 5.25.4, Display Server: X Server 1.21.1.4, OpenGL: 4.5 Mesa 22.1.6 (LLVM 14.0.6 128 bits), Compiler: Clang 14.0.6, File-System: ext4, Screen Resolution: 2560x1600

GCC:

  Processor: Apple M2 @ 2.42GHz (4 Cores / 8 Threads), Motherboard: Apple MacBook Air (13 h M2 2022), Memory: 8GB, Disk: 251GB APPLE SSD AP0256Z + 2 x 0GB APPLE SSD AP0256Z, Graphics: llvmpipe, Network: Broadcom Device 4433 + Broadcom Device 5f71

  OS: Arch rolling, Kernel: 5.19.0-rc7-asahi-2-1-ARCH (aarch64), Desktop: KDE Plasma 5.25.4, Display Server: X Server 1.21.1.4, OpenGL: 4.5 Mesa 22.1.6 (LLVM 14.0.6 128 bits), Compiler: GCC 12.1.0 + Clang 14.0.6, File-System: ext4, Screen Resolution: 2560x1600

JPEG XL libjxl 0.6.1
Input: PNG - Encode Speed: 5
MP/s > Higher Is Better
Clang . 35.58
GCC ... 20.78

Crypto++ 8.2
Test: Unkeyed Algorithms
MiB/second > Higher Is Better
Clang . 360.18
GCC ... 590.72

NCNN 20210720
Target: CPU-v2-v2 - Model: mobilenet-v2
ms < Lower Is Better
Clang . 3.47
GCC ... 2.20

Crypto++ 8.2
Test: Keyed Algorithms
MiB/second > Higher Is Better
Clang . 404.76
GCC ... 591.62

GraphicsMagick 1.3.33
Operation: HWB Color Space
Iterations Per Minute > Higher Is Better
Clang . 950
GCC ... 1361

TNN 0.3
Target: CPU - Model: DenseNet
ms < Lower Is Better
Clang . 7411.32
GCC ... 5229.21

TNN 0.3
Target: CPU - Model: MobileNet v2
ms < Lower Is Better
Clang . 426.56
GCC ... 306.61

GraphicsMagick 1.3.33
Operation: Noise-Gaussian
Iterations Per Minute > Higher Is Better
Clang . 121
GCC ... 164

SciMark 2.0
Computational Test: Dense LU Matrix Factorization
Mflops > Higher Is Better
Clang . 6907.38
GCC ... 5124.15

JPEG XL libjxl 0.6.1
Input: PNG - Encode Speed: 7
MP/s > Higher Is Better
Clang . 13.12
GCC ... 9.78

LAME MP3 Encoding 3.100
WAV To MP3
Seconds < Lower Is Better
Clang . 7.573
GCC ... 5.678

NCNN 20210720
Target: CPU - Model: resnet18
ms < Lower Is Better
Clang . 6.16
GCC ... 7.84

Liquid-DSP 2021.01.31
Threads: 2 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
Clang . 76895333
GCC ... 60534667

Liquid-DSP 2021.01.31
Threads: 1 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
Clang . 38463333
GCC ... 30318000

Liquid-DSP 2021.01.31
Threads: 4 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
Clang . 153386667
GCC ... 121106667

Timed MrBayes Analysis 3.2.7
Primate Phylogeny Analysis
Seconds < Lower Is Better
Clang . 192.75
GCC ... 240.57

eSpeak-NG Speech Engine 20200907
Text-To-Speech Synthesis
Seconds < Lower Is Better
Clang . 20.30
GCC ... 16.93

NCNN 20220729
Target: CPU - Model: resnet18
ms < Lower Is Better
Clang . 5.90
GCC ... 7.04

Ngspice 34
Circuit: C7552
Seconds < Lower Is Better
Clang . 56.10
GCC ... 66.06

SciMark 2.0
Computational Test: Composite
Mflops > Higher Is Better
Clang . 2870.78
GCC ... 2454.22

Zstd Compression 1.5.0
Compression Level: 19 - Decompression Speed
MB/s > Higher Is Better
Clang . 3932.4
GCC ... 4513.1

AOBench
Size: 2048 x 2048 - Total Time
Seconds < Lower Is Better
Clang . 29.48
GCC ... 25.75

NCNN 20210720
Target: CPU - Model: squeezenet_ssd
ms < Lower Is Better
Clang . 12.64
GCC ... 14.40

NCNN 20210720
Target: CPU - Model: vgg16
ms < Lower Is Better
Clang . 29.59
GCC ... 33.70

Liquid-DSP 2021.01.31
Threads: 8 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
Clang . 190544667
GCC ... 167353333

Zstd Compression 1.5.0
Compression Level: 3 - Decompression Speed
MB/s > Higher Is Better
Clang . 4234.8
GCC ... 4821.4

Coremark 1.0
CoreMark Size 666 - Iterations Per Second
Iterations/Sec > Higher Is Better
Clang . 175753.76
GCC ... 199947.69

GraphicsMagick 1.3.33
Operation: Resizing
Iterations Per Minute > Higher Is Better
Clang . 613
GCC ... 539

Zstd Compression 1.5.0
Compression Level: 8 - Decompression Speed
MB/s > Higher Is Better
Clang . 4385.7
GCC ... 4986.0

JPEG XL libjxl 0.6.1
Input: JPEG - Encode Speed: 5
MP/s > Higher Is Better
Clang . 114.37
GCC ... 101.26

Zstd Compression 1.5.0
Compression Level: 8, Long Mode - Decompression Speed
MB/s > Higher Is Better
Clang . 4854.7
GCC ... 5478.3

Zstd Compression 1.5.0
Compression Level: 3, Long Mode - Decompression Speed
MB/s > Higher Is Better
Clang . 4647.8
GCC ... 5241.1

JPEG XL libjxl 0.6.1
Input: JPEG - Encode Speed: 7
MP/s > Higher Is Better
Clang . 114.98
GCC ... 102.13

SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops > Higher Is Better
Clang . 500.91
GCC ... 563.13

FLAC Audio Encoding 1.3.3
WAV To FLAC
Seconds < Lower Is Better
Clang . 41.52
GCC ... 37.25

Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS > Higher Is Better
Clang . 8507.41
GCC ... 7634.59

Botan 2.17.3
Test: KASUMI
MiB/s > Higher Is Better
Clang . 93.79
GCC ... 84.30

Crypto++ 8.2
Test: Integer + Elliptic Curve Public Key Algorithms
MiB/second > Higher Is Better
Clang . 2161.31
GCC ... 1947.62

GraphicsMagick 1.3.33
Operation: Enhanced
Iterations Per Minute > Higher Is Better
Clang . 155
GCC ... 171

Zstd Compression 1.5.0
Compression Level: 19, Long Mode - Decompression Speed
MB/s > Higher Is Better
Clang . 4146.6
GCC ... 4543.4

Botan 2.17.3
Test: KASUMI - Decrypt
MiB/s > Higher Is Better
Clang . 91.83
GCC ... 84.11

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops > Higher Is Better
Clang . 4508.82
GCC ... 4132.95

Basis Universal 1.13
Settings: UASTC Level 0
Seconds < Lower Is Better
Clang . 6.548
GCC ... 6.003

Basis Universal 1.13
Settings: ETC1S
Seconds < Lower Is Better
Clang . 26.16
GCC ... 24.13

Xmrig 6.12.1
Variant: Monero - Hash Count: 1M
H/s > Higher Is Better
Clang . 2520.6
GCC ... 2329.9

Sockperf 3.7
Test: Throughput
Messages Per Second > Higher Is Better
Clang . 793185
GCC ... 847339

TNN 0.3
Target: CPU - Model: SqueezeNet v2
ms < Lower Is Better
Clang . 56.61
GCC ... 53.28

simdjson 2.0
Throughput Test: DistinctUserID
GB/s > Higher Is Better
Clang . 4.45
GCC ... 4.19

OpenSSL 3.0
Algorithm: SHA256
byte/s > Higher Is Better
Clang . 9058229720
GCC ... 8534811807

Xmrig 6.12.1
Variant: Wownero - Hash Count: 1M
H/s > Higher Is Better
Clang . 2676.2
GCC ... 2541.9

libgav1 0.17
Video Input: Chimera 1080p 10-bit
FPS > Higher Is Better
Clang . 81.34
GCC ... 77.30

GraphicsMagick 1.3.33
Operation: Sharpen
Iterations Per Minute > Higher Is Better
Clang . 97
GCC ... 102

Basis Universal 1.13
Settings: UASTC Level 2
Seconds < Lower Is Better
Clang . 35.88
GCC ... 34.12

Botan 2.17.3
Test: Blowfish
MiB/s > Higher Is Better
Clang . 436.06
GCC ... 415.07

Botan 2.17.3
Test: Blowfish - Decrypt
MiB/s > Higher Is Better
Clang . 436.82
GCC ... 416.06

Zstd Compression 1.5.0
Compression Level: 19, Long Mode - Compression Speed
MB/s > Higher Is Better
Clang . 21.0
GCC ... 21.9

simdjson 2.0
Throughput Test: TopTweet
GB/s > Higher Is Better
Clang . 4.43
GCC ... 4.25

Botan 2.17.3
Test: ChaCha20Poly1305 - Decrypt
MiB/s > Higher Is Better
Clang . 570.92
GCC ... 547.82

simdjson 2.0
Throughput Test: PartialTweets
GB/s > Higher Is Better
Clang . 4.35
GCC ... 4.18

simdjson 2.0
Throughput Test: LargeRandom
GB/s > Higher Is Better
Clang . 1.00
GCC ... 1.04

LAMMPS Molecular Dynamics Simulator 23Jun2022
Model: Rhodopsin Protein
ns/day > Higher Is Better
Clang . 3.440
GCC ... 3.315

libjpeg-turbo tjbench 2.1.0
Test: Decompression Throughput
Megapixels/sec > Higher Is Better
Clang . 214.70
GCC ... 222.67

Botan 2.17.3
Test: ChaCha20Poly1305
MiB/s > Higher Is Better
Clang . 578.63
GCC ... 558.13

Zstd Compression 1.5.0
Compression Level: 8, Long Mode - Compression Speed
MB/s > Higher Is Better
Clang . 691.9
GCC ... 716.9

libgav1 0.17
Video Input: Chimera 1080p
FPS > Higher Is Better
Clang . 161.84
GCC ... 156.32

Zstd Compression 1.5.0
Compression Level: 19 - Compression Speed
MB/s > Higher Is Better
Clang . 25.9
GCC ... 26.8

SciMark 2.0
Computational Test: Monte Carlo
Mflops > Higher Is Better
Clang . 450.23
GCC ... 463.86

Botan 2.17.3
Test: Twofish
MiB/s > Higher Is Better
Clang . 351.49
GCC ... 341.75

POV-Ray 3.7.0.7
Trace Time
Seconds < Lower Is Better
Clang . 88.93
GCC ... 91.46

Zstd Compression 1.5.0
Compression Level: 3, Long Mode - Compression Speed
MB/s > Higher Is Better
Clang . 272.4
GCC ... 265.4

TNN 0.3
Target: CPU - Model: SqueezeNet v1.1
ms < Lower Is Better
Clang . 330.09
GCC ... 321.83

Opus Codec Encoding 1.3.1
WAV To Opus Encode
Seconds < Lower Is Better
Clang . 14.67
GCC ... 14.30

WavPack Audio Encoding 5.3
WAV To WavPack
Seconds < Lower Is Better
Clang . 17.98
GCC ... 18.42

SQLite Speedtest 3.30
Timed Time - Size 1,000
Seconds < Lower Is Better
Clang . 45.45
GCC ... 44.64

Zstd Compression 1.5.0
Compression Level: 8 - Compression Speed
MB/s > Higher Is Better
Clang . 880.2
GCC ... 866.1

OpenSSL 3.0
Algorithm: RSA4096
verify/s > Higher Is Better
Clang . 107954.2
GCC ... 106464.4

libgav1 0.17
Video Input: Summer Nature 1080p
FPS > Higher Is Better
Clang . 133.16
GCC ... 131.35

JPEG XL libjxl 0.6.1
Input: JPEG - Encode Speed: 8
MP/s > Higher Is Better
Clang . 38.65
GCC ... 38.16

LuaJIT 2.1-git
Test: Monte Carlo
Mflops > Higher Is Better
Clang . 424.27
GCC ... 429.71

Google Draco 1.5.0
Model: Lion
ms < Lower Is Better
Clang . 3433
GCC ... 3476

dav1d 1.0
Video Input: Summer Nature 1080p
FPS > Higher Is Better
Clang . 527.87
GCC ... 534.01

Gcrypt Library 1.9
Seconds < Lower Is Better
Clang . 255.91
GCC ... 258.39

dav1d 1.0
Video Input: Chimera 1080p 10-bit
FPS > Higher Is Better
Clang . 283.70
GCC ... 280.98

libgav1 0.17
Video Input: Summer Nature 4K
FPS > Higher Is Better
Clang . 55.66
GCC ... 55.13

GraphicsMagick 1.3.33
Operation: Rotate
Iterations Per Minute > Higher Is Better
Clang . 1571
GCC ... 1586

Google Draco 1.5.0
Model: Church Facade
ms < Lower Is Better
Clang . 5044
GCC ... 5080

simdjson 2.0
Throughput Test: Kostya
GB/s > Higher Is Better
Clang . 3.03
GCC ... 3.05

LuaJIT 2.1-git
Test: Sparse Matrix Multiply
Mflops > Higher Is Better
Clang . 1902.44
GCC ... 1890.85

OpenSSL 3.0
Algorithm: RSA4096
sign/s > Higher Is Better
Clang . 1529.5
GCC ... 1520.3

dav1d 1.0
Video Input: Chimera 1080p
FPS > Higher Is Better
Clang . 376.77
GCC ... 375.87

Basis Universal 1.13
Settings: UASTC Level 3
Seconds < Lower Is Better
Clang . 70.16
GCC ... 70.00

OpenJPEG 2.4
Encode: NASA Curiosity Panorama M34
ms < Lower Is Better
Clang . 48724
GCC ... 48829

Botan 2.17.3
Test: CAST-256 - Decrypt
MiB/s > Higher Is Better
Clang . 136.99
GCC ... 137.21

Botan 2.17.3
Test: Twofish - Decrypt
MiB/s > Higher Is Better
Clang . 347.60
GCC ... 348.14

LuaJIT 2.1-git
Test: Composite
Mflops > Higher Is Better
Clang . 1387.75
GCC ... 1386.16

GnuPG 2.2.27
2.7GB Sample File Encryption
Seconds < Lower Is Better
Clang . 43.85
GCC ... 43.89

LuaJIT 2.1-git
Test: Dense LU Matrix Factorization
Mflops > Higher Is Better
Clang . 3157.76
GCC ... 3156.17

LuaJIT 2.1-git
Test: Fast Fourier Transform
Mflops > Higher Is Better
Clang . 560.45
GCC ... 560.21

Zstd Compression 1.5.0
Compression Level: 3 - Compression Speed
MB/s > Higher Is Better
Clang . 3525.2
GCC ... 3526.6

SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops > Higher Is Better
Clang . 1986.56
GCC ... 1987.03

Botan 2.17.3
Test: CAST-256
MiB/s > Higher Is Better
Clang . 136.76
GCC ... 136.75

LuaJIT 2.1-git
Test: Jacobi Successive Over-Relaxation
Mflops > Higher Is Better
Clang . 893.81
GCC ... 893.84

NCNN 20220729
Target: CPU - Model: FastestDet
ms < Lower Is Better
Clang . 4.05
GCC ... 1.87

NCNN 20220729
Target: CPU - Model: vision_transformer
ms < Lower Is Better
Clang . 544.38
GCC ... 501.17

NCNN 20220729
Target: CPU - Model: regnety_400m
ms < Lower Is Better
Clang . 10.17
GCC ... 5.19

NCNN 20220729
Target: CPU - Model: squeezenet_ssd
ms < Lower Is Better
Clang . 12.09
GCC ... 9.28

NCNN 20220729
Target: CPU - Model: yolov4-tiny
ms < Lower Is Better
Clang . 20.24
GCC ... 14.51

NCNN 20220729
Target: CPU - Model: resnet50
ms < Lower Is Better
Clang . 28.71
GCC ... 15.24

NCNN 20220729
Target: CPU - Model: alexnet
ms < Lower Is Better
Clang . 11.80
GCC ... 12.39

NCNN 20220729
Target: CPU - Model: vgg16
ms < Lower Is Better
Clang . 34.44
GCC ... 32.80

NCNN 20220729
Target: CPU - Model: googlenet
ms < Lower Is Better
Clang . 10.83
GCC ... 12.72

NCNN 20220729
Target: CPU - Model: blazeface
ms < Lower Is Better
Clang . 2.09
GCC ... 1.90

NCNN 20220729
Target: CPU - Model: efficientnet-b0
ms < Lower Is Better
Clang . 5.56
GCC ... 3.26

NCNN 20220729
Target: CPU - Model: mnasnet
ms < Lower Is Better
Clang . 3.53
GCC ... 2.28

NCNN 20220729
Target: CPU - Model: shufflenet-v2
ms < Lower Is Better
Clang . 3.00
GCC ... 1.90

NCNN 20220729
Target: CPU-v3-v3 - Model: mobilenet-v3
ms < Lower Is Better
Clang . 3.25
GCC ... 2.20

NCNN 20220729
Target: CPU-v2-v2 - Model: mobilenet-v2
ms < Lower Is Better
Clang . 3.62
GCC ... 2.21

NCNN 20220729
Target: CPU - Model: mobilenet
ms < Lower Is Better
Clang . 13.07
GCC ... 11.99

NCNN 20210720
Target: CPU - Model: mnasnet
ms < Lower Is Better
Clang . 3.63
GCC ... 2.23

NCNN 20210720
Target: CPU - Model: regnety_400m
ms < Lower Is Better
Clang . 10.38
GCC ... 5.28

NCNN 20210720
Target: CPU - Model: yolov4-tiny
ms < Lower Is Better
Clang . 20.18
GCC ... 14.67

NCNN 20210720
Target: CPU - Model: resnet50
ms < Lower Is Better
Clang . 15.69
GCC ... 15.36

NCNN 20210720
Target: CPU - Model: alexnet
ms < Lower Is Better
Clang . 11.15
GCC ... 12.08

NCNN 20210720
Target: CPU - Model: googlenet
ms < Lower Is Better
Clang . 11.03
GCC ... 12.59

NCNN 20210720
Target: CPU - Model: blazeface
ms < Lower Is Better
Clang . 2.28
GCC ... 1.86

NCNN 20210720
Target: CPU - Model: efficientnet-b0
ms < Lower Is Better
Clang . 6.55
GCC ... 3.82

NCNN 20210720
Target: CPU - Model: shufflenet-v2
ms < Lower Is Better
Clang . 3.13
GCC ... 1.92

NCNN 20210720
Target: CPU-v3-v3 - Model: mobilenet-v3
ms < Lower Is Better
Clang . 3.31
GCC ... 2.06

NCNN 20210720
Target: CPU - Model: mobilenet
ms < Lower Is Better
Clang . 15.03
GCC ... 11.94

Ngspice 34
Circuit: C2670
Seconds < Lower Is Better
Clang . 76.93
GCC ... 100.08

Primesieve 8.0
Length: 1e12
Seconds < Lower Is Better
Clang . 40.37
GCC ... 39.10

C-Ray 1.1
Total Time - 4K, 16 Rays Per Pixel
Seconds < Lower Is Better
Clang . 95.54
GCC ... 79.03

dav1d 1.0
Video Input: Summer Nature 4K
FPS > Higher Is Better
Clang . 105.09
GCC ... 115.13

GraphicsMagick 1.3.33
Operation: Swirl
Iterations Per Minute > Higher Is Better
Clang . 311
GCC ... 322
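
Because the results mix "Higher Is Better" and "Lower Is Better" metrics, the raw numbers cannot be compared directly across tests. The following is a minimal sketch of one way to normalize them into a GCC-versus-Clang ratio and summarize a handful of tests with a geometric mean. The function names and the choice of four sample tests are illustrative assumptions, not part of the original result file; the values themselves are copied from the results above.

```python
import math

# (test, Clang result, GCC result, higher_is_better), values copied from the results above.
# The selection of these four tests is arbitrary and only for illustration.
RESULTS = [
    ("JPEG XL libjxl 0.6.1, PNG, Encode Speed 5 (MP/s)", 35.58, 20.78, True),
    ("LAME MP3 Encoding 3.100, WAV To MP3 (Seconds)", 7.573, 5.678, False),
    ("Coremark 1.0 (Iterations/Sec)", 175753.76, 199947.69, True),
    ("Ngspice 34, Circuit C7552 (Seconds)", 56.10, 66.06, False),
]


def gcc_over_clang(clang, gcc, higher_is_better):
    """Return GCC's performance relative to Clang; a value above 1.0 means GCC was faster."""
    # For throughput-style metrics a larger number is better, so divide GCC by Clang.
    # For time-style metrics a smaller number is better, so invert the ratio.
    return gcc / clang if higher_is_better else clang / gcc


def geometric_mean(ratios):
    """Geometric mean of positive ratios, computed via logarithms."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


if __name__ == "__main__":
    ratios = []
    for name, clang, gcc, higher_is_better in RESULTS:
        ratio = gcc_over_clang(clang, gcc, higher_is_better)
        ratios.append(ratio)
        ahead = "GCC" if ratio > 1.0 else "Clang"
        print(f"{name}: GCC/Clang = {ratio:.3f} ({ahead} ahead)")
    print(f"Geometric mean over {len(ratios)} tests: {geometric_mean(ratios):.3f}")
```

A geometric mean is used here because it treats a 2x win and a 2x loss symmetrically, which an arithmetic mean of ratios does not.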