Intel Core i9 11900K Compiler Benchmarks

GCC 11.1 versus LLVM Clang 12 on Intel Core i9 11900K Rocket Lake. Benchmarks by Michael Larabel for a future article.

Tested configurations:

GCC 11.1: -O2
GCC 11.1: -O3 -march=native
GCC 11.1: -O3 -march=native -flto
Clang 12: -O2
Clang 12: -O3 -march=native
Clang 12: -O3 -march=native -flto

All six configurations ran on the same system; only the compiler differs.

Processor: Intel Core i9-11900K @ 5.10GHz (8 Cores / 16 Threads), Motherboard: ASUS ROG MAXIMUS XIII HERO (0707 BIOS), Chipset: Intel Tiger Lake-H, Memory: 32GB, Disk: 500GB Western Digital WDS500G3X0C-00SJG0 + 15GB Ultra USB 3.0, Graphics: AMD Radeon VII 16GB (1801/1000MHz), Audio: Intel Tiger Lake-H HD Audio, Monitor: ASUS MG28U, Network: 2 x Intel I225-V + Intel Wi-Fi 6 AX210/AX211/AX411

OS: Fedora 34, Kernel: 5.11.20-300.fc34.x86_64 (x86_64), Desktop: GNOME Shell 40.1, Display Server: X Server + Wayland, OpenGL: 4.6 Mesa 21.0.3 (LLVM 12.0.0), Compiler: GCC 11.1.1 20210428 (GCC 11.1 runs) or Clang 12.0.0 (Clang 12 runs), File-System: btrfs, Screen Resolution: 3840x2160

Timed MrBayes Analysis 3.2.7
Primate Phylogeny Analysis
Seconds < Lower Is Better
GCC 11.1: -O2 ..................... 87.30
GCC 11.1: -O3 -march=native ....... 86.70
GCC 11.1: -O3 -march=native -flto . 84.93
Clang 12: -O2 ..................... 85.57
Clang 12: -O3 -march=native ....... 82.55
Clang 12: -O3 -march=native -flto . 83.56

Timed HMMer Search 3.3.2
Pfam Database Search
Seconds < Lower Is Better
GCC 11.1: -O2 ..................... 103.29
GCC 11.1: -O3 -march=native ....... 100.74
GCC 11.1: -O3 -march=native -flto . 99.97
Clang 12: -O2 ..................... 101.50
Clang 12: -O3 -march=native ....... 99.61
Clang 12: -O3 -march=native -flto . 99.01

LAMMPS Molecular Dynamics Simulator 29Oct2020
Model: Rhodopsin Protein
ns/day > Higher Is Better
GCC 11.1: -O2 ..................... 8.023
GCC 11.1: -O3 -march=native ....... 8.067
GCC 11.1: -O3 -march=native -flto . 8.328
Clang 12: -O2 ..................... 8.140
Clang 12: -O3 -march=native ....... 8.164
Clang 12: -O3 -march=native -flto . 8.239

WebP Image Encode 1.1
Encode Settings: Quality 100, Lossless
Encode Time - Seconds < Lower Is Better
GCC 11.1: -O2 ..................... 13.76
GCC 11.1: -O3 -march=native ....... 12.90
GCC 11.1: -O3 -march=native -flto . 12.71
Clang 12: -O2 ..................... 13.02
Clang 12: -O3 -march=native ....... 13.05
Clang 12: -O3 -march=native -flto . 12.89

WebP Image Encode 1.1
Encode Settings: Quality 100, Highest Compression
Encode Time - Seconds < Lower Is Better
GCC 11.1: -O2 ..................... 5.360
GCC 11.1: -O3 -march=native ....... 5.127
GCC 11.1: -O3 -march=native -flto . 5.103
Clang 12: -O2 ..................... 4.885
Clang 12: -O3 -march=native ....... 4.760
Clang 12: -O3 -march=native -flto . 4.731

WebP Image Encode 1.1
Encode Settings: Quality 100, Lossless, Highest Compression
Encode Time - Seconds < Lower Is Better
GCC 11.1: -O2 ..................... 27.84
GCC 11.1: -O3 -march=native ....... 27.26
GCC 11.1: -O3 -march=native -flto . 27.07
Clang 12: -O2 ..................... 28.26
Clang 12: -O3 -march=native ....... 28.08
Clang 12: -O3 -march=native -flto . 27.08

GraphicsMagick 1.3.33
Operation: Rotate
Iterations Per Minute > Higher Is Better
GCC 11.1: -O2 ..................... 1066
GCC 11.1: -O3 -march=native ....... 1141
GCC 11.1: -O3 -march=native -flto . 1072
Clang 12: -O2 ..................... 1051
Clang 12: -O3 -march=native ....... 1080
Clang 12: -O3 -march=native -flto . 1092

GraphicsMagick 1.3.33
Operation: Sharpen
Iterations Per Minute > Higher Is Better
GCC 11.1: -O2 ..................... 164
GCC 11.1: -O3 -march=native ....... 195
GCC 11.1: -O3 -march=native -flto . 195
Clang 12: -O2 ..................... 162
Clang 12: -O3 -march=native ....... 163
Clang 12: -O3 -march=native -flto . 163

GraphicsMagick 1.3.33
Operation: Enhanced
Iterations Per Minute > Higher Is Better
GCC 11.1: -O2 ..................... 219
GCC 11.1: -O3 -march=native ....... 270
GCC 11.1: -O3 -march=native -flto . 269
Clang 12: -O2 ..................... 217
Clang 12: -O3 -march=native ....... 253
Clang 12: -O3 -march=native -flto . 254

GraphicsMagick 1.3.33
Operation: Resizing
Iterations Per Minute > Higher Is Better
GCC 11.1: -O2 ..................... 1091
GCC 11.1: -O3 -march=native ....... 1198
GCC 11.1: -O3 -march=native -flto . 1229
Clang 12: -O2 ..................... 1044
Clang 12: -O3 -march=native ....... 1070
Clang 12: -O3 -march=native -flto . 1195

SVT-HEVC 1.5.0
Tuning: 7 - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
GCC 11.1: -O2 ..................... 136.31
GCC 11.1: -O3 -march=native ....... 139.13
GCC 11.1: -O3 -march=native -flto . 141.83
Clang 12: -O2 ..................... 138.07
Clang 12: -O3 -march=native ....... 142.10
Clang 12: -O3 -march=native -flto . 146.01

SVT-HEVC 1.5.0
Tuning: 10 - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
GCC 11.1: -O2 ..................... 273.60
GCC 11.1: -O3 -march=native ....... 278.72
GCC 11.1: -O3 -march=native -flto . 278.59
Clang 12: -O2 ..................... 271.50
Clang 12: -O3 -march=native ....... 276.29
Clang 12: -O3 -march=native -flto . 283.29

SVT-VP9 0.3
Tuning: VMAF Optimized - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
GCC 11.1: -O2 ..................... 191.83
GCC 11.1: -O3 -march=native ....... 195.87
GCC 11.1: -O3 -march=native -flto . 195.07
Clang 12: -O2 ..................... 193.85
Clang 12: -O3 -march=native ....... 195.08
Clang 12: -O3 -march=native -flto . 199.49

SVT-VP9 0.3
Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
GCC 11.1: -O2 ..................... 198.01
GCC 11.1: -O3 -march=native ....... 201.70
GCC 11.1: -O3 -march=native -flto . 201.10
Clang 12: -O2 ..................... 199.37
Clang 12: -O3 -march=native ....... 199.59
Clang 12: -O3 -march=native -flto . 203.83

SVT-VP9 0.3
Tuning: Visual Quality Optimized - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
GCC 11.1: -O2 ..................... 160.65
GCC 11.1: -O3 -march=native ....... 164.77
GCC 11.1: -O3 -march=native -flto . 166.05
Clang 12: -O2 ..................... 163.97
Clang 12: -O3 -march=native ....... 164.95
Clang 12: -O3 -march=native -flto . 167.95

x265 3.4
Video Input: Bosphorus 4K
Frames Per Second > Higher Is Better
GCC 11.1: -O2 ..................... 15.64
GCC 11.1: -O3 -march=native ....... 15.81
GCC 11.1: -O3 -march=native -flto . 15.40
Clang 12: -O2 ..................... 15.59
Clang 12: -O3 -march=native ....... 15.52
Clang 12: -O3 -march=native -flto . 15.93

Coremark 1.0
CoreMark Size 666 - Iterations Per Second
Iterations/Sec > Higher Is Better
GCC 11.1: -O2 ..................... 430127.50
GCC 11.1: -O3 -march=native ....... 432583.96
GCC 11.1: -O3 -march=native -flto . 435901.44
Clang 12: -O2 ..................... 377278.13
Clang 12: -O3 -march=native ....... 366868.63
Clang 12: -O3 -march=native -flto . 373279.58

Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS > Higher Is Better
GCC 11.1: -O2 ..................... 6305.48
GCC 11.1: -O3 -march=native ....... 6878.51
GCC 11.1: -O3 -march=native -flto . 7079.88
Clang 12: -O2 ..................... 6204.05
Clang 12: -O3 -march=native ....... 6291.58
Clang 12: -O3 -march=native -flto . 6434.83

PJSIP 2.11
Method: INVITE
Responses Per Second > Higher Is Better
GCC 11.1: -O2 ..................... 5001
GCC 11.1: -O3 -march=native ....... 4959
GCC 11.1: -O3 -march=native -flto . 5058
Clang 12: -O2 ..................... 4965
Clang 12: -O3 -march=native ....... 5024

PJSIP 2.11
Method: OPTIONS, Stateful
Responses Per Second > Higher Is Better
GCC 11.1: -O2 ..................... 9381
GCC 11.1: -O3 -march=native ....... 9389
GCC 11.1: -O3 -march=native -flto . 9395
Clang 12: -O2 ..................... 9362
Clang 12: -O3 -march=native ....... 9382

PJSIP 2.11
Method: OPTIONS, Stateless
Responses Per Second > Higher Is Better
GCC 11.1: -O2 ..................... 239792
GCC 11.1: -O3 -march=native ....... 241439
GCC 11.1: -O3 -march=native -flto . 239892
Clang 12: -O2 ..................... 241312
Clang 12: -O3 -march=native ....... 241426

C-Ray 1.1
Total Time - 4K, 16 Rays Per Pixel
Seconds < Lower Is Better
GCC 11.1: -O2 ..................... 106.52
GCC 11.1: -O3 -march=native ....... 47.35
GCC 11.1: -O3 -march=native -flto . 47.61
Clang 12: -O2 ..................... 82.66
Clang 12: -O3 -march=native ....... 84.05
Clang 12: -O3 -march=native -flto . 85.08

AOBench
Size: 2048 x 2048 - Total Time
Seconds < Lower Is Better
GCC 11.1: -O2 ..................... 24.46
GCC 11.1: -O3 -march=native ....... 21.54
GCC 11.1: -O3 -march=native -flto . 21.58
Clang 12: -O2 ..................... 25.00
Clang 12: -O3 -march=native ....... 23.00
Clang 12: -O3 -march=native -flto . 22.90

FLAC Audio Encoding 1.3.2
WAV To FLAC
Seconds < Lower Is Better
GCC 11.1: -O2 ..................... 6.086
GCC 11.1: -O3 -march=native ....... 5.931
GCC 11.1: -O3 -march=native -flto . 5.936
Clang 12: -O2 ..................... 7.593
Clang 12: -O3 -march=native ....... 5.956
Clang 12: -O3 -march=native -flto . 5.958

LAME MP3 Encoding 3.100
WAV To MP3
Seconds < Lower Is Better
GCC 11.1: -O2 ..................... 7.304
GCC 11.1: -O3 -march=native ....... 5.479
GCC 11.1: -O3 -march=native -flto . 5.376
Clang 12: -O2 ..................... 7.034
Clang 12: -O3 -march=native ....... 6.461
Clang 12: -O3 -march=native -flto . 6.205

Opus Codec Encoding 1.3.1
WAV To Opus Encode
Seconds < Lower Is Better
GCC 11.1: -O2 ..................... 6.467
GCC 11.1: -O3 -march=native ....... 5.587
GCC 11.1: -O3 -march=native -flto . 5.575
Clang 12: -O2 ..................... 6.206
Clang 12: -O3 -march=native ....... 5.952
Clang 12: -O3 -march=native -flto . 5.870

Liquid-DSP 2021.01.31
Threads: 8 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
GCC 11.1: -O2 ..................... 635506667
GCC 11.1: -O3 -march=native ....... 686530000
GCC 11.1: -O3 -march=native -flto . 684356667
Clang 12: -O2 ..................... 742070000
Clang 12: -O3 -march=native ....... 712080000
Clang 12: -O3 -march=native -flto . 699980000

Liquid-DSP 2021.01.31
Threads: 16 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
GCC 11.1: -O2 ..................... 711343333
GCC 11.1: -O3 -march=native ....... 722893333
GCC 11.1: -O3 -march=native -flto . 722393333
Clang 12: -O2 ..................... 813766667
Clang 12: -O3 -march=native ....... 768436667
Clang 12: -O3 -march=native -flto . 753703333

libjpeg-turbo tjbench 2.1.0
Test: Decompression Throughput
Megapixels/sec > Higher Is Better
GCC 11.1: -O2 ..................... 261.03
GCC 11.1: -O3 -march=native ....... 273.10
GCC 11.1: -O3 -march=native -flto . 272.60
Clang 12: -O2 ..................... 273.30
Clang 12: -O3 -march=native ....... 282.71
Clang 12: -O3 -march=native -flto . 282.67

ASTC Encoder 2.4
Preset: Medium
Seconds < Lower Is Better
GCC 11.1: -O2 ..................... 5.2481
GCC 11.1: -O3 -march=native ....... 5.1820
GCC 11.1: -O3 -march=native -flto . 5.1705
Clang 12: -O2 ..................... 3.8832
Clang 12: -O3 -march=native ....... 3.7813
Clang 12: -O3 -march=native -flto . 3.7710

ASTC Encoder 2.4
Preset: Thorough
Seconds < Lower Is Better
GCC 11.1: -O2 ..................... 12.0949
GCC 11.1: -O3 -march=native ....... 11.3846
GCC 11.1: -O3 -march=native -flto . 11.3952
Clang 12: -O2 ..................... 10.5088
Clang 12: -O3 -march=native ....... 9.5559
Clang 12: -O3 -march=native -flto . 9.5653

ASTC Encoder 2.4
Preset: Exhaustive
Seconds < Lower Is Better
GCC 11.1: -O2 ..................... 91.38
GCC 11.1: -O3 -march=native ....... 85.42
GCC 11.1: -O3 -march=native -flto . 85.42
Clang 12: -O2 ..................... 85.56
Clang 12: -O3 -march=native ....... 74.78
Clang 12: -O3 -march=native -flto . 74.80

SQLite Speedtest 3.30
Timed Time - Size 1,000
Seconds < Lower Is Better
GCC 11.1: -O2 ..................... 43.62
GCC 11.1: -O3 -march=native ....... 44.09
GCC 11.1: -O3 -march=native -flto . 43.78
Clang 12: -O2 ..................... 46.26
Clang 12: -O3 -march=native ....... 46.54
Clang 12: -O3 -march=native -flto . 46.37

NCNN 20201218
Target: CPU - Model: mobilenet
ms < Lower Is Better
GCC 11.1: -O2 ..................... 15.15
GCC 11.1: -O3 -march=native ....... 11.83
GCC 11.1: -O3 -march=native -flto . 13.34
Clang 12: -O2 ..................... 12.60
Clang 12: -O3 -march=native ....... 12.05
Clang 12: -O3 -march=native -flto . 12.13

NCNN 20201218
Target: CPU-v2-v2 - Model: mobilenet-v2
ms < Lower Is Better
GCC 11.1: -O2 ..................... 4.20
GCC 11.1: -O3 -march=native ....... 3.24
GCC 11.1: -O3 -march=native -flto . 3.25
Clang 12: -O2 ..................... 3.45
Clang 12: -O3 -march=native ....... 3.36
Clang 12: -O3 -march=native -flto . 3.28

NCNN 20201218
Target: CPU-v3-v3 - Model: mobilenet-v3
ms < Lower Is Better
GCC 11.1: -O2 ..................... 3.18
GCC 11.1: -O3 -march=native ....... 2.55
GCC 11.1: -O3 -march=native -flto . 2.52
Clang 12: -O2 ..................... 2.64
Clang 12: -O3 -march=native ....... 2.56
Clang 12: -O3 -march=native -flto . 2.46

NCNN 20201218
Target: CPU - Model: mnasnet
ms < Lower Is Better
GCC 11.1: -O2 ..................... 3.11
GCC 11.1: -O3 -march=native ....... 2.30
GCC 11.1: -O3 -march=native -flto . 2.27
Clang 12: -O2 ..................... 2.39
Clang 12: -O3 -march=native ....... 2.33
Clang 12: -O3 -march=native -flto . 2.23

NCNN 20201218
Target: CPU - Model: efficientnet-b0
ms < Lower Is Better
GCC 11.1: -O2 ..................... 5.23
GCC 11.1: -O3 -march=native ....... 4.38
GCC 11.1: -O3 -march=native -flto . 4.32
Clang 12: -O2 ..................... 4.46
Clang 12: -O3 -march=native ....... 4.36
Clang 12: -O3 -march=native -flto . 4.39

NCNN 20201218
Target: CPU - Model: googlenet
ms < Lower Is Better
GCC 11.1: -O2 ..................... 11.11
GCC 11.1: -O3 -march=native ....... 10.20
GCC 11.1: -O3 -march=native -flto . 10.27
Clang 12: -O2 ..................... 10.86
Clang 12: -O3 -march=native ....... 10.58
Clang 12: -O3 -march=native -flto . 10.53

NCNN 20201218
Target: CPU - Model: vgg16
ms < Lower Is Better
GCC 11.1: -O2 ..................... 54.80
GCC 11.1: -O3 -march=native ....... 54.50
GCC 11.1: -O3 -march=native -flto . 54.13
Clang 12: -O2 ..................... 55.66
Clang 12: -O3 -march=native ....... 54.40
Clang 12: -O3 -march=native -flto . 54.38

NCNN 20201218
Target: CPU - Model: resnet18
ms < Lower Is Better
GCC 11.1: -O2 ..................... 11.30
GCC 11.1: -O3 -march=native ....... 11.08
GCC 11.1: -O3 -march=native -flto . 11.39
Clang 12: -O2 ..................... 11.50
Clang 12: -O3 -march=native ....... 11.22
Clang 12: -O3 -march=native -flto . 11.19

NCNN 20201218
Target: CPU - Model: alexnet
ms < Lower Is Better
GCC 11.1: -O2 ..................... 9.63
GCC 11.1: -O3 -march=native ....... 9.63
GCC 11.1: -O3 -march=native -flto . 9.70
Clang 12: -O2 ..................... 10.01
Clang 12: -O3 -march=native ....... 9.91
Clang 12: -O3 -march=native -flto . 9.86

NCNN 20201218
Target: CPU - Model: resnet50
ms < Lower Is Better
GCC 11.1: -O2 ..................... 22.07
GCC 11.1: -O3 -march=native ....... 18.23
GCC 11.1: -O3 -march=native -flto . 18.43
Clang 12: -O2 ..................... 19.06
Clang 12: -O3 -march=native ....... 18.33
Clang 12: -O3 -march=native -flto . 18.31

NCNN 20201218
Target: CPU - Model: squeezenet_ssd
ms < Lower Is Better
GCC 11.1: -O2 ..................... 16.15
GCC 11.1: -O3 -march=native ....... 15.53
GCC 11.1: -O3 -march=native -flto . 15.92
Clang 12: -O2 ..................... 15.34
Clang 12: -O3 -march=native ....... 15.35
Clang 12: -O3 -march=native -flto . 15.46

NCNN 20201218
Target: CPU - Model: regnety_400m
ms < Lower Is Better
GCC 11.1: -O2 ..................... 9.61
GCC 11.1: -O3 -march=native ....... 8.62
GCC 11.1: -O3 -march=native -flto . 8.91
Clang 12: -O2 ..................... 9.55
Clang 12: -O3 -march=native ....... 9.20
Clang 12: -O3 -march=native -flto . 8.99

TNN 0.2.3
Target: CPU - Model: MobileNet v2
ms < Lower Is Better
GCC 11.1: -O2 ..................... 243.42
GCC 11.1: -O3 -march=native ....... 230.02
GCC 11.1: -O3 -march=native -flto . 247.89
Clang 12: -O2 ..................... 308.51
Clang 12: -O3 -march=native ....... 336.91
Clang 12: -O3 -march=native -flto . 342.86

TNN 0.2.3
Target: CPU - Model: SqueezeNet v1.1
ms < Lower Is Better
GCC 11.1: -O2 ..................... 236.05
GCC 11.1: -O3 -march=native ....... 227.66
GCC 11.1: -O3 -march=native -flto . 242.55
Clang 12: -O2 ..................... 239.53
Clang 12: -O3 -march=native ....... 259.57
Clang 12: -O3 -march=native -flto . 258.99
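
Because the results mix "Lower Is Better" metrics (seconds, ms) with "Higher Is Better" metrics (frames per second, MFLOPS), comparing two configurations means normalizing the direction first. The following is a minimal sketch of one way to do that, not part of the original result file. The function name `relative_gain` is a hypothetical helper, and the inputs are the C-Ray and Himeno figures reported above.

```python
def relative_gain(baseline, candidate, higher_is_better):
    """Percentage improvement of candidate over baseline.

    Positive means the candidate configuration is faster, whichever
    direction the underlying metric runs in.
    """
    if higher_is_better:
        # Throughput-style metric: a larger value is faster.
        return (candidate / baseline - 1.0) * 100.0
    # Time-style metric: a smaller value is faster.
    return (baseline / candidate - 1.0) * 100.0


# C-Ray 1.1 (seconds, lower is better):
# GCC 11.1 -O2 versus GCC 11.1 -O3 -march=native
c_ray = relative_gain(106.52, 47.35, higher_is_better=False)

# Himeno Benchmark 3.0 (MFLOPS, higher is better):
# GCC 11.1 -O2 versus GCC 11.1 -O3 -march=native -flto
himeno = relative_gain(6305.48, 7079.88, higher_is_better=True)

print(f"C-Ray gain:  {c_ray:.1f}%")
print(f"Himeno gain: {himeno:.1f}%")
```

A negative return value marks a regression, as with the TNN MobileNet v2 result when going from Clang 12 -O2 to Clang 12 -O3 -march=native.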