EPYC 7763 LLVM Clang Compiler Tests

AMD EPYC 7763 64-Core testing with a Supermicro H12SSL-i v1.01 (2.0 BIOS) and ASPEED on Ubuntu 20.04 via the Phoronix Test Suite.

Clang 12.0:

  Processor: AMD EPYC 7763 64-Core @ 2.45GHz (64 Cores / 128 Threads), Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS), Chipset: AMD Starship/Matisse, Memory: 126GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe

  OS: Ubuntu 20.04, Kernel: 5.12.0-051200rc6daily20210408-generic (x86_64) 20210407, Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Compiler: Clang 12.0.0-++20210409092622+fa0971b87fb2-1~exp1~20210409193326.73, File-System: ext4, Screen Resolution: 1024x768

Clang 11.0:

  Processor: AMD EPYC 7763 64-Core @ 2.45GHz (64 Cores / 128 Threads), Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS), Chipset: AMD Starship/Matisse, Memory: 126GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe

  OS: Ubuntu 20.04, Kernel: 5.12.0-051200rc6daily20210408-generic (x86_64) 20210407, Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Compiler: Clang 11.0.0-2~ubuntu20.04.1, File-System: ext4, Screen Resolution: 1024x768

Clang 12.0 LTO:

  Processor: AMD EPYC 7763 64-Core @ 2.45GHz (64 Cores / 128 Threads), Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS), Chipset: AMD Starship/Matisse, Memory: 126GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe

  OS: Ubuntu 20.04, Kernel: 5.12.0-051200rc6daily20210408-generic (x86_64) 20210407, Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Compiler: Clang 12.0.0-++20210409092622+fa0971b87fb2-1~exp1~20210409193326.73, File-System: ext4, Screen Resolution: 1024x768

GCC 9.3:

  Processor: AMD EPYC 7763 64-Core @ 2.45GHz (64 Cores / 128 Threads), Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS), Chipset: AMD Starship/Matisse, Memory: 126GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe

  OS: Ubuntu 20.04, Kernel: 5.12.0-051200rc6daily20210408-generic (x86_64) 20210407, Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Compiler: GCC 9.3.0, File-System: ext4, Screen Resolution: 1024x768

GCC 10.3:

  Processor: AMD EPYC 7763 64-Core @ 2.45GHz (64 Cores / 128 Threads), Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS), Chipset: AMD Starship/Matisse, Memory: 126GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe

  OS: Ubuntu 20.04, Kernel: 5.12.0-051200rc6daily20210408-generic (x86_64) 20210407, Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Compiler: GCC 10.3.0, File-System: ext4, Screen Resolution: 1024x768

GCC 11.0.1:

  Processor: AMD EPYC 7763 64-Core @ 2.45GHz (64 Cores / 128 Threads), Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS), Chipset: AMD Starship/Matisse, Memory: 126GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe

  OS: Ubuntu 20.04, Kernel: 5.12.0-051200rc6daily20210408-generic (x86_64) 20210407, Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Compiler: GCC 11.0.1 20210413, File-System: ext4, Screen Resolution: 1024x768

AMD AOCC 3.0:

  Processor: AMD EPYC 7763 64-Core @ 2.45GHz (64 Cores / 128 Threads), Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS), Chipset: AMD Starship/Matisse, Memory: 126GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe

  OS: Ubuntu 20.04, Kernel: 5.12.0-051200rc6daily20210408-generic (x86_64) 20210407, Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Compiler: Clang 12.0.0, File-System: ext4, Screen Resolution: 1024x768

oneDNN 2.1.2
Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
ms < Lower Is Better
Clang 12.0 ... 1.44425
Clang 11.0 ... 1.45757
GCC 9.3 ...... 7.19213
GCC 10.3 ..... 7.23686
AMD AOCC 3.0 . 1.37059

ViennaCL 1.7.1
Test: CPU BLAS - dCOPY
GB/s > Higher Is Better
Clang 12.0 ... 604.0
Clang 11.0 ... 1877.0
GCC 9.3 ...... 1587.0
GCC 10.3 ..... 1461.2
GCC 11.0.1 ... 1599.0
AMD AOCC 3.0 . 1944.0

ViennaCL 1.7.1
Test: CPU BLAS - dAXPY
GB/s > Higher Is Better
Clang 12.0 ... 878.0
Clang 11.0 ... 1043.0
GCC 9.3 ...... 1521.0
GCC 10.3 ..... 2158.4
GCC 11.0.1 ... 2359.0
AMD AOCC 3.0 . 1017.0

Etcpak 0.7
Configuration: DXT1
Mpx/s > Higher Is Better
Clang 12.0 ..... 2718.53
Clang 11.0 ..... 1872.76
Clang 12.0 LTO . 2719.99
GCC 9.3 ........ 1082.37
GCC 10.3 ....... 1114.60
AMD AOCC 3.0 ... 2654.72

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NN
GFLOPs/s > Higher Is Better
Clang 12.0 ... 48.6
Clang 11.0 ... 83.6
GCC 9.3 ...... 98.5
GCC 10.3 ..... 98.7
GCC 11.0.1 ... 100.5
AMD AOCC 3.0 . 84.0

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TN
GFLOPs/s > Higher Is Better
Clang 12.0 ... 51.9
Clang 11.0 ... 88.3
GCC 9.3 ...... 100.9
GCC 10.3 ..... 104.0
GCC 11.0.1 ... 104.0
AMD AOCC 3.0 . 90.0

dav1d 0.8.2
Video Input: Chimera 1080p 10-bit
FPS > Higher Is Better
Clang 12.0 ... 308.32
Clang 11.0 ... 184.19
GCC 9.3 ...... 305.36
GCC 10.3 ..... 316.14
GCC 11.0.1 ... 334.35
AMD AOCC 3.0 . 192.00

GraphicsMagick 1.3.33
Operation: Resizing
Iterations Per Minute > Higher Is Better
Clang 12.0 ... 2136
Clang 11.0 ... 2034
GCC 9.3 ...... 1238
GCC 10.3 ..... 1208
GCC 11.0.1 ... 1188
AMD AOCC 3.0 . 1866

Botan 2.17.3
Test: ChaCha20Poly1305 - Decrypt
MiB/s > Higher Is Better
Clang 12.0 ... 843.40
Clang 11.0 ... 840.64
GCC 9.3 ...... 611.98
GCC 10.3 ..... 476.18
AMD AOCC 3.0 . 838.09

C-Ray 1.1
Total Time - 4K, 16 Rays Per Pixel
Seconds < Lower Is Better
Clang 12.0 ... 15.870
Clang 11.0 ... 15.599
GCC 9.3 ...... 9.158
GCC 10.3 ..... 9.029
GCC 11.0.1 ... 9.227
AMD AOCC 3.0 . 15.649

Botan 2.17.3
Test: ChaCha20Poly1305
MiB/s > Higher Is Better
Clang 12.0 ... 850.50
Clang 11.0 ... 848.24
GCC 9.3 ...... 616.10
GCC 10.3 ..... 485.02
AMD AOCC 3.0 . 845.14

oneDNN 2.1.2
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
ms < Lower Is Better
Clang 12.0 ... 1.172580
Clang 11.0 ... 1.151400
GCC 9.3 ...... 0.717782
GCC 10.3 ..... 0.788192
AMD AOCC 3.0 . 1.170440

LibRaw 0.20
Post-Processing Benchmark
Mpix/sec > Higher Is Better
Clang 12.0 ... 41.78
Clang 11.0 ... 38.71
GCC 9.3 ...... 60.20
GCC 10.3 ..... 58.90
GCC 11.0.1 ... 57.24
AMD AOCC 3.0 . 41.64

FinanceBench 2016-07-25
Benchmark: Bonds OpenMP
ms < Lower Is Better
Clang 12.0 ... 51596.87
Clang 11.0 ... 51900.43
GCC 9.3 ...... 76805.58
GCC 10.3 ..... 51770.51
GCC 11.0.1 ... 51376.82
AMD AOCC 3.0 . 51885.52

oneDNN 2.1.2
Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
ms < Lower Is Better
Clang 12.0 ... 1.221320
Clang 11.0 ... 0.841169
GCC 9.3 ...... 0.869308
GCC 10.3 ..... 0.870784
AMD AOCC 3.0 . 0.833921

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NT
GFLOPs/s > Higher Is Better
Clang 12.0 ... 65.7
Clang 11.0 ... 79.3
GCC 9.3 ...... 95.3
GCC 10.3 ..... 94.4
GCC 11.0.1 ... 95.0
AMD AOCC 3.0 . 78.8

ViennaCL 1.7.1
Test: CPU BLAS - dDOT
GB/s > Higher Is Better
Clang 12.0 ... 819.00
Clang 11.0 ... 933.00
GCC 9.3 ...... 1133.00
GCC 10.3 ..... 1056.42
GCC 11.0.1 ... 1153.00
AMD AOCC 3.0 . 1165.00

SVT-AV1 0.8
Encoder Mode: Enc Mode 0 - Input: 1080p
Frames Per Second > Higher Is Better
Clang 12.0 ... 0.183
Clang 11.0 ... 0.181
GCC 9.3 ...... 0.129
GCC 10.3 ..... 0.169
GCC 11.0.1 ... 0.176
AMD AOCC 3.0 . 0.183

toyBrot Fractal Generator 2020-11-18
Implementation: C++ Threads
ms < Lower Is Better
Clang 12.0 ..... 7220
Clang 11.0 ..... 6395
Clang 12.0 LTO . 7143
GCC 9.3 ........ 5142
GCC 10.3 ....... 5383
AMD AOCC 3.0 ... 7144

Etcpak 0.7
Configuration: ETC1
Mpx/s > Higher Is Better
Clang 12.0 ..... 284.64
Clang 11.0 ..... 205.07
Clang 12.0 LTO . 284.76
GCC 9.3 ........ 269.67
GCC 10.3 ....... 281.15
AMD AOCC 3.0 ... 211.73

toyBrot Fractal Generator 2020-11-18
Implementation: TBB
ms < Lower Is Better
Clang 12.0 ..... 6780
Clang 11.0 ..... 6247
Clang 12.0 LTO . 7085
GCC 9.3 ........ 5107
GCC 10.3 ....... 5181
AMD AOCC 3.0 ... 6945

toyBrot Fractal Generator 2020-11-18
Implementation: OpenMP
ms < Lower Is Better
Clang 12.0 ... 7507
Clang 11.0 ... 7029
GCC 9.3 ...... 5451
GCC 10.3 ..... 5524
AMD AOCC 3.0 . 7477

toyBrot Fractal Generator 2020-11-18
Implementation: C++ Tasks
ms < Lower Is Better
Clang 12.0 ..... 7437
Clang 11.0 ..... 6836
Clang 12.0 LTO . 7367
GCC 9.3 ........ 5414
GCC 10.3 ....... 5610
AMD AOCC 3.0 ... 7189

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TT
GFLOPs/s > Higher Is Better
Clang 12.0 ... 73.0
Clang 11.0 ... 84.0
GCC 9.3 ...... 97.9
GCC 10.3 ..... 98.5
GCC 11.0.1 ... 99.3
AMD AOCC 3.0 . 84.4

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops > Higher Is Better
Clang 12.0 ... 4280.22
Clang 11.0 ... 4590.37
GCC 9.3 ...... 3765.88
GCC 10.3 ..... 3820.77
GCC 11.0.1 ... 3462.66
AMD AOCC 3.0 . 4594.27

Botan 2.17.3
Test: Blowfish
MiB/s > Higher Is Better
Clang 12.0 ... 380.05
Clang 11.0 ... 319.23
GCC 9.3 ...... 412.85
GCC 10.3 ..... 422.14
AMD AOCC 3.0 . 319.79

GraphicsMagick 1.3.33
Operation: Sharpen
Iterations Per Minute > Higher Is Better
Clang 12.0 ... 614
Clang 11.0 ... 613
GCC 9.3 ...... 806
GCC 10.3 ..... 807
GCC 11.0.1 ... 809
AMD AOCC 3.0 . 617

oneDNN 2.1.2
Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU
ms < Lower Is Better
Clang 12.0 ... 2.36797
Clang 11.0 ... 2.31859
GCC 9.3 ...... 2.99759
GCC 10.3 ..... 3.00341
AMD AOCC 3.0 . 2.28755

oneDNN 2.1.2
Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
ms < Lower Is Better
Clang 12.0 ... 0.491940
Clang 11.0 ... 0.489278
GCC 9.3 ...... 0.599140
GCC 10.3 ..... 0.602155
AMD AOCC 3.0 . 0.459724

GraphicsMagick 1.3.33
Operation: HWB Color Space
Iterations Per Minute > Higher Is Better
Clang 12.0 ... 605
Clang 11.0 ... 616
GCC 9.3 ...... 785
GCC 10.3 ..... 772
GCC 11.0.1 ... 771
AMD AOCC 3.0 . 614

oneDNN 2.1.2
Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU
ms < Lower Is Better
Clang 12.0 ... 0.710124
Clang 11.0 ... 0.594729
GCC 9.3 ...... 0.654010
GCC 10.3 ..... 0.646252
AMD AOCC 3.0 . 0.554231

FinanceBench 2016-07-25
Benchmark: Repo OpenMP
ms < Lower Is Better
Clang 12.0 ... 33246.84
Clang 11.0 ... 33178.50
GCC 9.3 ...... 42399.81
GCC 10.3 ..... 34979.29
GCC 11.0.1 ... 34199.60
AMD AOCC 3.0 . 33146.03

SVT-AV1 0.8
Encoder Mode: Enc Mode 4 - Input: 1080p
Frames Per Second > Higher Is Better
Clang 12.0 ... 11.474
Clang 11.0 ... 11.821
GCC 9.3 ...... 9.325
GCC 10.3 ..... 11.230
GCC 11.0.1 ... 11.905
AMD AOCC 3.0 . 11.690

oneDNN 2.1.2
Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
ms < Lower Is Better
Clang 12.0 ... 2.03606
Clang 11.0 ... 1.60540
GCC 9.3 ...... 1.66260
GCC 10.3 ..... 1.64268
AMD AOCC 3.0 . 1.59597

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-T
GB/s > Higher Is Better
Clang 12.0 ... 626.0
Clang 11.0 ... 677.0
GCC 9.3 ...... 798.0
GCC 10.3 ..... 741.4
GCC 11.0.1 ... 794.0
AMD AOCC 3.0 . 783.0

SVT-AV1 0.8
Encoder Mode: Enc Mode 8 - Input: 1080p
Frames Per Second > Higher Is Better
Clang 12.0 ... 118.07
Clang 11.0 ... 117.39
GCC 9.3 ...... 92.98
GCC 10.3 ..... 109.70
GCC 11.0.1 ... 110.70
AMD AOCC 3.0 . 116.49

Coremark 1.0
CoreMark Size 666 - Iterations Per Second
Iterations/Sec > Higher Is Better
Clang 12.0 ... 1785466.28
Clang 11.0 ... 1790837.01
GCC 9.3 ...... 2086609.98
GCC 10.3 ..... 2110880.43
GCC 11.0.1 ... 2176407.67
AMD AOCC 3.0 . 1720060.44

ASTC Encoder 2.4
Preset: Medium
Seconds < Lower Is Better
Clang 12.0 ... 4.0058
Clang 11.0 ... 3.9837
GCC 9.3 ...... 4.8745
GCC 10.3 ..... 4.8699
GCC 11.0.1 ... 4.8160
AMD AOCC 3.0 . 3.8811

oneDNN 2.1.2
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
ms < Lower Is Better
Clang 12.0 ... 0.313689
Clang 11.0 ... 0.315522
GCC 9.3 ...... 0.376992
GCC 10.3 ..... 0.377733
AMD AOCC 3.0 . 0.301885

FFTW 3.3.6
Build: Float + SSE - Size: 1D FFT Size 2048
Mflops > Higher Is Better
Clang 12.0 ... 51254
Clang 11.0 ... 50084
GCC 9.3 ...... 52749
GCC 10.3 ..... 53497
GCC 11.0.1 ... 54710
AMD AOCC 3.0 . 44412

Liquid-DSP 2021.01.31
Threads: 128 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
Clang 12.0 ... 3643766667
Clang 11.0 ... 3596533333
GCC 9.3 ...... 3012066667
GCC 10.3 ..... 3005033333
GCC 11.0.1 ... 3055766667
AMD AOCC 3.0 . 3606466667

oneDNN 2.1.2
Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
ms < Lower Is Better
Clang 12.0 ... 593.97
Clang 11.0 ... 563.20
GCC 9.3 ...... 658.66
GCC 10.3 ..... 659.27
AMD AOCC 3.0 . 544.10

oneDNN 2.1.2
Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
ms < Lower Is Better
Clang 12.0 ... 590.18
Clang 11.0 ... 562.97
GCC 9.3 ...... 659.19
GCC 10.3 ..... 658.28
AMD AOCC 3.0 . 544.31

oneDNN 2.1.2
Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
Clang 12.0 ... 597.48
Clang 11.0 ... 563.25
GCC 9.3 ...... 657.88
GCC 10.3 ..... 658.04
AMD AOCC 3.0 . 544.60

SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops > Higher Is Better
Clang 12.0 ... 1785.50
Clang 11.0 ... 1785.42
GCC 9.3 ...... 2149.15
GCC 10.3 ..... 2038.15
GCC 11.0.1 ... 2148.84
AMD AOCC 3.0 . 1785.45

GraphicsMagick 1.3.33
Operation: Noise-Gaussian
Iterations Per Minute > Higher Is Better
Clang 12.0 ... 457
Clang 11.0 ... 463
GCC 9.3 ...... 547
GCC 10.3 ..... 544
GCC 11.0.1 ... 550
AMD AOCC 3.0 . 466

ONNX Runtime 1.6
Model: shufflenet-v2-10 - Device: OpenMP CPU
Inferences Per Minute > Higher Is Better
Clang 12.0 ... 9904
Clang 11.0 ... 9797
GCC 9.3 ...... 9419
GCC 10.3 ..... 10197
AMD AOCC 3.0 . 11325

Botan 2.17.3
Test: Blowfish - Decrypt
MiB/s > Higher Is Better
Clang 12.0 ... 351.28
Clang 11.0 ... 351.08
GCC 9.3 ...... 412.07
GCC 10.3 ..... 420.85
AMD AOCC 3.0 . 355.06

Etcpak 0.7
Configuration: ETC2
Mpx/s > Higher Is Better
Clang 12.0 ..... 202.09
Clang 11.0 ..... 168.82
Clang 12.0 LTO . 202.10
GCC 9.3 ........ 174.81
GCC 10.3 ....... 173.23
AMD AOCC 3.0 ... 178.85

Botan 2.17.3
Test: AES-256
MiB/s > Higher Is Better
Clang 12.0 ... 4659.34
Clang 11.0 ... 4901.13
GCC 9.3 ...... 5484.68
GCC 10.3 ..... 5525.71
AMD AOCC 3.0 . 4891.07

ASTC Encoder 2.4
Preset: Thorough
Seconds < Lower Is Better
Clang 12.0 ... 6.7647
Clang 11.0 ... 6.7674
GCC 9.3 ...... 7.8537
GCC 10.3 ..... 7.8370
GCC 11.0.1 ... 7.6989
AMD AOCC 3.0 . 6.6409

FLAC Audio Encoding 1.3.2
WAV To FLAC
Seconds < Lower Is Better
Clang 12.0 ... 7.854
Clang 11.0 ... 7.979
GCC 9.3 ...... 8.534
GCC 10.3 ..... 8.567
GCC 11.0.1 ... 8.709
AMD AOCC 3.0 . 9.280

Botan 2.17.3
Test: AES-256 - Decrypt
MiB/s > Higher Is Better
Clang 12.0 ... 4682.46
Clang 11.0 ... 4895.56
GCC 9.3 ...... 5391.99
GCC 10.3 ..... 5529.40
AMD AOCC 3.0 . 4887.57

LAME MP3 Encoding 3.100
WAV To MP3
Seconds < Lower Is Better
Clang 12.0 ... 8.256
Clang 11.0 ... 8.250
GCC 9.3 ...... 7.011
GCC 10.3 ..... 7.231
GCC 11.0.1 ... 7.473
AMD AOCC 3.0 . 8.142

TSCP 1.81
AI Chess Performance
Nodes Per Second > Higher Is Better
Clang 12.0 ... 1570966
Clang 11.0 ... 1638265
GCC 9.3 ...... 1446372
GCC 10.3 ..... 1467179
GCC 11.0.1 ... 1494250
AMD AOCC 3.0 . 1697846

GraphicsMagick 1.3.33
Operation: Enhanced
Iterations Per Minute > Higher Is Better
Clang 12.0 ... 1076
Clang 11.0 ... 1068
GCC 9.3 ...... 1217
GCC 10.3 ..... 1039
GCC 11.0.1 ... 1082
AMD AOCC 3.0 . 1057

Ngspice 34
Circuit: C2670
Seconds < Lower Is Better
Clang 12.0 ... 118.87
Clang 11.0 ... 103.83
GCC 9.3 ...... 101.54
GCC 10.3 ..... 103.60
GCC 11.0.1 ... 103.01
AMD AOCC 3.0 . 103.93

simdjson 0.8.2
Throughput Test: PartialTweets
GB/s > Higher Is Better
Clang 12.0 ... 4.60
Clang 11.0 ... 4.41
GCC 9.3 ...... 3.93
GCC 10.3 ..... 4.02
AMD AOCC 3.0 . 4.33

QuantLib 1.21
MFLOPS > Higher Is Better
Clang 12.0 ..... 2653.8
Clang 11.0 ..... 2640.2
Clang 12.0 LTO . 2657.8
GCC 9.3 ........ 2338.9
GCC 10.3 ....... 2392.6
AMD AOCC 3.0 ... 2725.7

simdjson 0.8.2
Throughput Test: DistinctUserID
GB/s > Higher Is Better
Clang 12.0 ... 4.62
Clang 11.0 ... 4.41
GCC 9.3 ...... 3.98
GCC 10.3 ..... 4.13
AMD AOCC 3.0 . 4.47

simdjson 0.8.2
Throughput Test: LargeRandom
GB/s > Higher Is Better
Clang 12.0 ... 0.84
Clang 11.0 ... 0.81
GCC 9.3 ...... 0.94
GCC 10.3 ..... 0.90
AMD AOCC 3.0 . 0.82

ONNX Runtime 1.6
Model: yolov4 - Device: OpenMP CPU
Inferences Per Minute > Higher Is Better
Clang 12.0 ... 333
Clang 11.0 ... 346
GCC 9.3 ...... 351
GCC 10.3 ..... 351
AMD AOCC 3.0 . 386

libavif avifenc 0.9.0
Encoder Speed: 6, Lossless
Seconds < Lower Is Better
Clang 12.0 ... 25.22
Clang 11.0 ... 26.03
GCC 9.3 ...... 29.08
GCC 10.3 ..... 26.91
GCC 11.0.1 ... 27.06
AMD AOCC 3.0 . 25.78

FFTW 3.3.6
Build: Float + SSE - Size: 1D FFT Size 4096
Mflops > Higher Is Better
Clang 12.0 ... 45428
Clang 11.0 ... 46676
GCC 9.3 ...... 52099
GCC 10.3 ..... 52130
GCC 11.0.1 ... 51391
AMD AOCC 3.0 . 45521

oneDNN 2.1.2
Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU
ms < Lower Is Better
Clang 12.0 ... 1.07507
Clang 11.0 ... 1.07577
GCC 9.3 ...... 1.17434
GCC 10.3 ..... 1.19747
AMD AOCC 3.0 . 1.04484

FFTW 3.3.6
Build: Stock - Size: 1D FFT Size 32
Mflops > Higher Is Better
Clang 12.0 ... 13333
Clang 11.0 ... 13324
GCC 9.3 ...... 14399
GCC 10.3 ..... 12576
GCC 11.0.1 ... 12765
AMD AOCC 3.0 . 13192

Botan 2.17.3
Test: Twofish
MiB/s > Higher Is Better
Clang 12.0 ... 315.41
Clang 11.0 ... 299.21
GCC 9.3 ...... 337.36
GCC 10.3 ..... 341.85
AMD AOCC 3.0 . 305.00

FFTW 3.3.6
Build: Float + SSE - Size: 1D FFT Size 32
Mflops > Higher Is Better
Clang 12.0 ... 15649
Clang 11.0 ... 14590
GCC 9.3 ...... 16590
GCC 10.3 ..... 16650
GCC 11.0.1 ...
16590 |========================================================= AMD AOCC 3.0 . 16146 |======================================================= oneDNN 2.1.2 Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU ms < Lower Is Better Clang 12.0 ... 1.07701 |================================================== Clang 11.0 ... 1.08011 |================================================== GCC 9.3 ...... 1.17486 |======================================================= GCC 10.3 ..... 1.17894 |======================================================= AMD AOCC 3.0 . 1.03899 |================================================ WebP Image Encode 1.1 Encode Settings: Quality 100, Highest Compression Encode Time - Seconds < Lower Is Better Clang 12.0 ... 6.309 |=================================================== Clang 11.0 ... 6.243 |================================================== GCC 9.3 ...... 7.053 |========================================================= GCC 10.3 ..... 7.078 |========================================================= GCC 11.0.1 ... 7.003 |======================================================== AMD AOCC 3.0 . 6.578 |===================================================== ONNX Runtime 1.6 Model: fcn-resnet101-11 - Device: OpenMP CPU Inferences Per Minute > Higher Is Better Clang 12.0 ... 112 |====================================================== Clang 11.0 ... 108 |==================================================== GCC 9.3 ...... 116 |======================================================== GCC 10.3 ..... 115 |======================================================== AMD AOCC 3.0 . 122 |=========================================================== GraphicsMagick 1.3.33 Operation: Swirl Iterations Per Minute > Higher Is Better Clang 12.0 ... 1993 |===================================================== Clang 11.0 ... 1915 |=================================================== GCC 9.3 ...... 2129 |========================================================= GCC 10.3 ..... 
2112 |========================================================= GCC 11.0.1 ... 2161 |========================================================== AMD AOCC 3.0 . 1929 |==================================================== Liquid-DSP 2021.01.31 Threads: 1 - Buffer Length: 256 - Filter Length: 57 samples/s > Higher Is Better Clang 12.0 ... 55663000 |================================================ Clang 11.0 ... 56307000 |================================================= GCC 9.3 ...... 61404000 |===================================================== GCC 10.3 ..... 62467333 |====================================================== GCC 11.0.1 ... 60886333 |===================================================== AMD AOCC 3.0 . 57411333 |================================================== Botan 2.17.3 Test: Twofish - Decrypt MiB/s > Higher Is Better Clang 12.0 ... 321.19 |===================================================== Clang 11.0 ... 302.41 |================================================== GCC 9.3 ...... 339.07 |======================================================== GCC 10.3 ..... 325.39 |====================================================== AMD AOCC 3.0 . 303.81 |================================================== oneDNN 2.1.2 Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU ms < Lower Is Better Clang 12.0 ... 3.28507 |================================================= Clang 11.0 ... 3.52787 |===================================================== GCC 9.3 ...... 3.67278 |======================================================= GCC 10.3 ..... 3.61144 |====================================================== AMD AOCC 3.0 . 3.41583 |=================================================== FFTW 3.3.6 Build: Stock - Size: 1D FFT Size 4096 Mflops > Higher Is Better Clang 12.0 ... 9862.0 |=================================================== Clang 11.0 ... 9438.6 |================================================= GCC 9.3 ...... 
10548.0 |======================================================= GCC 10.3 ..... 10179.0 |===================================================== GCC 11.0.1 ... 10205.0 |===================================================== AMD AOCC 3.0 . 9603.2 |================================================== FFTW 3.3.6 Build: Stock - Size: 2D FFT Size 1024 Mflops > Higher Is Better Clang 12.0 ... 9088.3 |==================================================== Clang 11.0 ... 8809.6 |================================================== GCC 9.3 ...... 9798.6 |======================================================== GCC 10.3 ..... 9247.3 |===================================================== GCC 11.0.1 ... 9238.4 |===================================================== AMD AOCC 3.0 . 8902.1 |=================================================== SecureMark 1.0.4 Benchmark: SecureMark-TLS marks > Higher Is Better Clang 12.0 ... 265204 |======================================================== Clang 11.0 ... 260119 |======================================================= GCC 9.3 ...... 238935 |================================================== GCC 10.3 ..... 242700 |=================================================== GCC 11.0.1 ... 243861 |=================================================== AMD AOCC 3.0 . 264637 |======================================================== AOM AV1 3.0 Encoder Mode: Speed 9 Realtime - Input: Bosphorus 1080p Frames Per Second > Higher Is Better Clang 12.0 . 103.17 |====================================================== Clang 11.0 . 100.55 |==================================================== GCC 9.3 .... 106.55 |======================================================== GCC 10.3 ... 107.46 |======================================================== GCC 11.0.1 . 111.27 |========================================================== WebP2 Image Encode 20210126 Encode Settings: Quality 100, Compression Effort 5 Seconds < Lower Is Better Clang 12.0 ... 
6.690 |==================================================== Clang 11.0 ... 7.366 |========================================================= GCC 9.3 ...... 6.753 |==================================================== GCC 10.3 ..... 6.934 |===================================================== AMD AOCC 3.0 . 7.403 |========================================================= FFTW 3.3.6 Build: Stock - Size: 1D FFT Size 1024 Mflops > Higher Is Better Clang 12.0 ... 10805 |===================================================== Clang 11.0 ... 10564 |==================================================== GCC 9.3 ...... 11689 |========================================================= GCC 10.3 ..... 11319 |======================================================= GCC 11.0.1 ... 11044 |====================================================== AMD AOCC 3.0 . 10669 |==================================================== PostgreSQL pgbench 13.0 Scaling Factor: 100 - Clients: 100 - Mode: Read Write - Average Latency ms < Lower Is Better Clang 12.0 . 1.607 |===================================================== Clang 11.0 . 1.626 |====================================================== GCC 9.3 .... 1.688 |======================================================== GCC 10.3 ... 1.701 |======================================================== GCC 11.0.1 . 1.777 |=========================================================== PostgreSQL pgbench 13.0 Scaling Factor: 100 - Clients: 100 - Mode: Read Write TPS > Higher Is Better Clang 12.0 . 62319 |=========================================================== Clang 11.0 . 61616 |========================================================== GCC 9.3 .... 59364 |======================================================== GCC 10.3 ... 58894 |======================================================== GCC 11.0.1 . 56369 |===================================================== FFTW 3.3.6 Build: Stock - Size: 1D FFT Size 2048 Mflops > Higher Is Better Clang 12.0 ... 
10467.0 |==================================================== Clang 11.0 ... 10004.2 |================================================== GCC 9.3 ...... 11053.0 |======================================================= GCC 10.3 ..... 10711.0 |===================================================== GCC 11.0.1 ... 10675.0 |===================================================== AMD AOCC 3.0 . 10227.0 |=================================================== libavif avifenc 0.9.0 Encoder Speed: 2 Seconds < Lower Is Better Clang 12.0 ... 25.18 |==================================================== Clang 11.0 ... 25.47 |==================================================== GCC 9.3 ...... 27.78 |========================================================= GCC 10.3 ..... 27.39 |======================================================== GCC 11.0.1 ... 27.10 |======================================================== AMD AOCC 3.0 . 25.60 |===================================================== Liquid-DSP 2021.01.31 Threads: 32 - Buffer Length: 256 - Filter Length: 57 samples/s > Higher Is Better Clang 12.0 ... 1564833333 |=============================================== Clang 11.0 ... 1578400000 |================================================ GCC 9.3 ...... 1721900000 |==================================================== GCC 10.3 ..... 1718000000 |==================================================== GCC 11.0.1 ... 1679800000 |=================================================== AMD AOCC 3.0 . 1609633333 |================================================= FFTW 3.3.6 Build: Float + SSE - Size: 2D FFT Size 4096 Mflops > Higher Is Better Clang 12.0 ... 22797 |==================================================== Clang 11.0 ... 22913 |==================================================== GCC 9.3 ...... 25068 |========================================================= GCC 10.3 ..... 23774 |====================================================== GCC 11.0.1 ... 
24888 |========================================================= AMD AOCC 3.0 . 23111 |===================================================== SciMark 2.0 Computational Test: Fast Fourier Transform Mflops > Higher Is Better Clang 12.0 ... 363.85 |=================================================== Clang 11.0 ... 399.16 |======================================================== GCC 9.3 ...... 384.03 |====================================================== GCC 10.3 ..... 388.98 |======================================================= GCC 11.0.1 ... 388.88 |======================================================= AMD AOCC 3.0 . 398.96 |======================================================== AOM AV1 3.0 Encoder Mode: Speed 8 Realtime - Input: Bosphorus 1080p Frames Per Second > Higher Is Better Clang 12.0 . 88.78 |======================================================= Clang 11.0 . 86.09 |====================================================== GCC 9.3 .... 91.97 |========================================================= GCC 10.3 ... 93.05 |========================================================== GCC 11.0.1 . 94.40 |=========================================================== libavif avifenc 0.9.0 Encoder Speed: 6 Seconds < Lower Is Better Clang 12.0 ... 9.510 |=================================================== Clang 11.0 ... 9.536 |=================================================== GCC 9.3 ...... 10.399 |======================================================== GCC 10.3 ..... 10.417 |======================================================== GCC 11.0.1 ... 10.291 |======================================================= AMD AOCC 3.0 . 9.725 |==================================================== oneDNN 2.1.2 Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU ms < Lower Is Better Clang 12.0 ... 1307.49 |==================================================== Clang 11.0 ... 1277.62 |=================================================== GCC 9.3 ...... 
1358.56 |====================================================== GCC 10.3 ..... 1379.51 |======================================================= AMD AOCC 3.0 . 1259.59 |================================================== oneDNN 2.1.2 Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU ms < Lower Is Better Clang 12.0 ... 1302.70 |==================================================== Clang 11.0 ... 1276.04 |=================================================== GCC 9.3 ...... 1357.29 |====================================================== GCC 10.3 ..... 1382.41 |======================================================= AMD AOCC 3.0 . 1267.18 |================================================== libavif avifenc 0.9.0 Encoder Speed: 0 Seconds < Lower Is Better Clang 12.0 ... 47.88 |==================================================== Clang 11.0 ... 47.89 |==================================================== GCC 9.3 ...... 52.22 |========================================================= GCC 10.3 ..... 51.45 |======================================================== GCC 11.0.1 ... 51.03 |======================================================== AMD AOCC 3.0 . 48.13 |===================================================== libavif avifenc 0.9.0 Encoder Speed: 10 Seconds < Lower Is Better Clang 12.0 ... 3.361 |==================================================== Clang 11.0 ... 3.429 |===================================================== GCC 9.3 ...... 3.659 |========================================================= GCC 10.3 ..... 3.643 |========================================================= GCC 11.0.1 ... 3.607 |======================================================== AMD AOCC 3.0 . 3.543 |======================================================= AOM AV1 3.0 Encoder Mode: Speed 6 Realtime - Input: Bosphorus 1080p Frames Per Second > Higher Is Better Clang 12.0 . 26.85 |=========================================================== Clang 11.0 . 
26.61 |========================================================== GCC 9.3 .... 24.84 |====================================================== GCC 10.3 ... 26.49 |========================================================== GCC 11.0.1 . 27.01 |=========================================================== WebP2 Image Encode 20210126 Encode Settings: Quality 100, Lossless Compression Seconds < Lower Is Better Clang 12.0 ... 374.04 |==================================================== Clang 11.0 ... 392.85 |====================================================== GCC 9.3 ...... 388.95 |====================================================== GCC 10.3 ..... 406.03 |======================================================== AMD AOCC 3.0 . 382.99 |===================================================== WebP2 Image Encode 20210126 Encode Settings: Quality 95, Compression Effort 7 Seconds < Lower Is Better Clang 12.0 ... 207.01 |==================================================== Clang 11.0 ... 203.63 |==================================================== GCC 9.3 ...... 220.94 |======================================================== GCC 10.3 ..... 215.57 |======================================================= AMD AOCC 3.0 . 205.03 |==================================================== oneDNN 2.1.2 Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU ms < Lower Is Better Clang 12.0 ... 1305.10 |==================================================== Clang 11.0 ... 1271.91 |=================================================== GCC 9.3 ...... 1356.91 |====================================================== GCC 10.3 ..... 1375.71 |======================================================= AMD AOCC 3.0 . 1268.08 |=================================================== WebP2 Image Encode 20210126 Encode Settings: Quality 75, Compression Effort 7 Seconds < Lower Is Better Clang 12.0 ... 109.53 |==================================================== Clang 11.0 ... 
109.64 |==================================================== GCC 9.3 ...... 118.45 |======================================================== GCC 10.3 ..... 116.66 |======================================================= AMD AOCC 3.0 . 109.81 |==================================================== LZ4 Compression 1.9.3 Compression Level: 9 - Compression Speed MB/s > Higher Is Better Clang 12.0 ..... 48.50 |=================================================== Clang 11.0 ..... 49.01 |=================================================== Clang 12.0 LTO . 48.47 |=================================================== GCC 9.3 ........ 51.97 |======================================================= GCC 10.3 ....... 52.36 |======================================================= GCC 11.0.1 ..... 51.17 |====================================================== AMD AOCC 3.0 ... 50.32 |===================================================== FFTW 3.3.6 Build: Stock - Size: 2D FFT Size 2048 Mflops > Higher Is Better Clang 12.0 ... 7789.9 |==================================================== Clang 11.0 ... 7878.5 |==================================================== GCC 9.3 ...... 8408.5 |======================================================== GCC 10.3 ..... 8134.5 |====================================================== GCC 11.0.1 ... 8231.1 |======================================================= AMD AOCC 3.0 . 7784.8 |==================================================== Timed MrBayes Analysis 3.2.7 Primate Phylogeny Analysis Seconds < Lower Is Better Clang 12.0 ..... 89.12 |==================================================== Clang 11.0 ..... 88.62 |==================================================== Clang 12.0 LTO . 93.63 |======================================================= GCC 9.3 ........ 89.16 |==================================================== GCC 10.3 ....... 93.66 |======================================================= GCC 11.0.1 ..... 
89.43 |===================================================== AMD AOCC 3.0 ... 86.74 |=================================================== GraphicsMagick 1.3.33 Operation: Rotate Iterations Per Minute > Higher Is Better Clang 12.0 ... 712 |=========================================================== Clang 11.0 ... 665 |======================================================= GCC 9.3 ...... 709 |=========================================================== GCC 10.3 ..... 689 |========================================================= GCC 11.0.1 ... 694 |========================================================== AMD AOCC 3.0 . 660 |======================================================= SVT-HEVC 1.5.0 Tuning: 10 - Input: Bosphorus 1080p Frames Per Second > Higher Is Better Clang 12.0 ... 643.58 |======================================================= Clang 11.0 ... 652.74 |======================================================== GCC 9.3 ...... 605.50 |==================================================== GCC 10.3 ..... 615.62 |===================================================== GCC 11.0.1 ... 611.73 |==================================================== AMD AOCC 3.0 . 638.10 |======================================================= Ngspice 34 Circuit: C7552 Seconds < Lower Is Better Clang 12.0 ... 95.96 |========================================================= Clang 11.0 ... 90.53 |====================================================== GCC 9.3 ...... 89.09 |===================================================== GCC 10.3 ..... 90.43 |====================================================== GCC 11.0.1 ... 90.26 |====================================================== AMD AOCC 3.0 . 91.99 |======================================================= AOM AV1 3.0 Encoder Mode: Speed 4 Two-Pass - Input: Bosphorus 1080p Frames Per Second > Higher Is Better Clang 12.0 . 7.10 |=========================================================== Clang 11.0 . 
7.20 |============================================================ GCC 9.3 .... 6.69 |======================================================== GCC 10.3 ... 6.87 |========================================================= GCC 11.0.1 . 6.95 |========================================================== SVT-HEVC 1.5.0 Tuning: 7 - Input: Bosphorus 1080p Frames Per Second > Higher Is Better Clang 12.0 ... 345.30 |======================================================== Clang 11.0 ... 346.89 |======================================================== GCC 9.3 ...... 322.42 |==================================================== GCC 10.3 ..... 330.53 |===================================================== GCC 11.0.1 ... 329.32 |===================================================== AMD AOCC 3.0 . 343.85 |======================================================== Botan 2.17.3 Test: KASUMI MiB/s > Higher Is Better Clang 12.0 ... 82.64 |======================================================== Clang 11.0 ... 79.15 |===================================================== GCC 9.3 ...... 84.86 |========================================================= GCC 10.3 ..... 79.12 |===================================================== AMD AOCC 3.0 . 82.83 |======================================================== POV-Ray 3.7.0.7 Trace Time Seconds < Lower Is Better Clang 12.0 ... 9.296 |===================================================== Clang 11.0 ... 9.408 |====================================================== GCC 9.3 ...... 9.968 |========================================================= GCC 10.3 ..... 9.570 |======================================================= AMD AOCC 3.0 . 9.494 |====================================================== FFTW 3.3.6 Build: Float + SSE - Size: 1D FFT Size 1024 Mflops > Higher Is Better Clang 12.0 ... 50350 |====================================================== Clang 11.0 ... 50740 |====================================================== GCC 9.3 ...... 
53275 |========================================================= GCC 10.3 ..... 52054 |======================================================== GCC 11.0.1 ... 51706 |======================================================= AMD AOCC 3.0 . 49685 |===================================================== libavif avifenc 0.9.0 Encoder Speed: 10, Lossless Seconds < Lower Is Better Clang 12.0 ... 5.746 |===================================================== Clang 11.0 ... 5.879 |====================================================== GCC 9.3 ...... 6.131 |========================================================= GCC 10.3 ..... 6.107 |========================================================= GCC 11.0.1 ... 6.149 |========================================================= AMD AOCC 3.0 . 5.948 |======================================================= SVT-HEVC 1.5.0 Tuning: 1 - Input: Bosphorus 1080p Frames Per Second > Higher Is Better Clang 12.0 ... 41.09 |========================================================= Clang 11.0 ... 41.01 |========================================================= GCC 9.3 ...... 38.41 |===================================================== GCC 10.3 ..... 39.03 |====================================================== GCC 11.0.1 ... 38.86 |====================================================== AMD AOCC 3.0 . 40.95 |========================================================= PostgreSQL pgbench 13.0 Scaling Factor: 100 - Clients: 250 - Mode: Read Write TPS > Higher Is Better Clang 12.0 . 56684 |=========================================================== Clang 11.0 . 54488 |========================================================= GCC 9.3 .... 53825 |======================================================== GCC 10.3 ... 53019 |======================================================= GCC 11.0.1 . 53102 |======================================================= JPEG XL 0.3.3 Input: PNG - Encode Speed: 7 MP/s > Higher Is Better Clang 12.0 ... 
12.15 |========================================================= Clang 11.0 ... 12.01 |======================================================== AMD AOCC 3.0 . 11.37 |===================================================== PostgreSQL pgbench 13.0 Scaling Factor: 100 - Clients: 250 - Mode: Read Write - Average Latency ms < Lower Is Better Clang 12.0 . 4.431 |======================================================= Clang 11.0 . 4.603 |========================================================= GCC 9.3 .... 4.657 |========================================================== GCC 10.3 ... 4.731 |=========================================================== GCC 11.0.1 . 4.722 |=========================================================== JPEG XL 0.3.3 Input: PNG - Encode Speed: 5 MP/s > Higher Is Better Clang 12.0 ... 74.27 |===================================================== Clang 11.0 ... 78.41 |======================================================== AMD AOCC 3.0 . 79.23 |========================================================= SciMark 2.0 Computational Test: Monte Carlo Mflops > Higher Is Better Clang 12.0 ... 675.13 |======================================================= Clang 11.0 ... 674.86 |======================================================= GCC 9.3 ...... 668.10 |====================================================== GCC 10.3 ..... 682.87 |======================================================= GCC 11.0.1 ... 647.82 |===================================================== AMD AOCC 3.0 . 690.94 |======================================================== AOM AV1 3.0 Encoder Mode: Speed 6 Realtime - Input: Bosphorus 4K Frames Per Second > Higher Is Better Clang 12.0 . 17.22 |========================================================== Clang 11.0 . 17.13 |========================================================== GCC 9.3 .... 16.29 |======================================================= GCC 10.3 ... 
17.03 |========================================================== GCC 11.0.1 . 17.37 |=========================================================== WebP2 Image Encode 20210126 Encode Settings: Default Seconds < Lower Is Better Clang 12.0 ... 2.739 |====================================================== Clang 11.0 ... 2.743 |====================================================== GCC 9.3 ...... 2.778 |====================================================== GCC 10.3 ..... 2.918 |========================================================= AMD AOCC 3.0 . 2.816 |======================================================= AOM AV1 3.0 Encoder Mode: Speed 9 Realtime - Input: Bosphorus 4K Frames Per Second > Higher Is Better Clang 12.0 . 38.11 |========================================================= Clang 11.0 . 37.28 |======================================================= GCC 9.3 .... 39.12 |========================================================== GCC 10.3 ... 39.32 |========================================================== GCC 11.0.1 . 39.71 |=========================================================== AOM AV1 3.0 Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 4K Frames Per Second > Higher Is Better Clang 12.0 . 8.99 |======================================================== Clang 11.0 . 9.14 |========================================================= GCC 9.3 .... 9.57 |============================================================ GCC 10.3 ... 9.10 |========================================================= GCC 11.0.1 . 9.41 |=========================================================== x265 3.4 Video Input: Bosphorus 4K Frames Per Second > Higher Is Better Clang 12.0 ... 30.32 |========================================================= Clang 11.0 ... 29.94 |======================================================== GCC 9.3 ...... 28.91 |====================================================== GCC 10.3 ..... 28.60 |====================================================== GCC 11.0.1 ... 
28.79 |====================================================== AMD AOCC 3.0 . 30.44 |========================================================= AOM AV1 3.0 Encoder Mode: Speed 8 Realtime - Input: Bosphorus 4K Frames Per Second > Higher Is Better Clang 12.0 . 33.39 |======================================================== Clang 11.0 . 33.14 |======================================================= GCC 9.3 .... 34.56 |========================================================== GCC 10.3 ... 35.26 |=========================================================== GCC 11.0.1 . 35.26 |=========================================================== AOM AV1 3.0 Encoder Mode: Speed 0 Two-Pass - Input: Bosphorus 1080p Frames Per Second > Higher Is Better Clang 12.0 . 0.53 |============================================================ Clang 11.0 . 0.53 |============================================================ GCC 9.3 .... 0.50 |========================================================= GCC 10.3 ... 0.52 |=========================================================== GCC 11.0.1 . 0.52 |=========================================================== Tachyon 0.99b6 Total Time Seconds < Lower Is Better Clang 12.0 ... 16.05 |======================================================== Clang 11.0 ... 16.41 |========================================================= GCC 9.3 ...... 15.68 |====================================================== GCC 10.3 ..... 16.15 |======================================================== GCC 11.0.1 ... 15.50 |====================================================== AMD AOCC 3.0 . 16.06 |======================================================== LZ4 Compression 1.9.3 Compression Level: 3 - Compression Speed MB/s > Higher Is Better Clang 12.0 ..... 52.07 |===================================================== Clang 11.0 ..... 52.35 |===================================================== Clang 12.0 LTO . 50.93 |==================================================== GCC 9.3 ........ 
53.83 |=======================================================
GCC 10.3 ....... 52.87 |======================================================
GCC 11.0.1 ..... 51.32 |====================================================
AMD AOCC 3.0 ... 53.77 |=======================================================

SVT-VP9 0.3
Tuning: Visual Quality Optimized - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
Clang 12.0 ... 372.49 |========================================================
Clang 11.0 ... 373.99 |========================================================
GCC 9.3 ...... 354.21 |=====================================================
GCC 10.3 ..... 364.12 |=======================================================
GCC 11.0.1 ... 366.39 |=======================================================
AMD AOCC 3.0 . 373.89 |========================================================

Liquid-DSP 2021.01.31
Threads: 64 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
Clang 12.0 ... 3070633333 |====================================================
Clang 11.0 ... 3051366667 |===================================================
GCC 9.3 ...... 2940466667 |=================================================
GCC 10.3 ..... 2942866667 |=================================================
GCC 11.0.1 ... 2989400000 |==================================================
AMD AOCC 3.0 . 3100400000 |====================================================

PostgreSQL pgbench 13.0
Scaling Factor: 100 - Clients: 1 - Mode: Read Only
TPS > Higher Is Better
Clang 12.0 . 24310 |==========================================================
Clang 11.0 . 24943 |===========================================================
GCC 9.3 .... 23895 |=========================================================
GCC 10.3 ... 24845 |===========================================================
GCC 11.0.1 .
23661 |========================================================

WebP Image Encode 1.1
Encode Settings: Quality 100, Lossless
Encode Time - Seconds < Lower Is Better
Clang 12.0 ... 19.02 |========================================================
Clang 11.0 ... 18.57 |=======================================================
GCC 9.3 ...... 19.30 |=========================================================
GCC 10.3 ..... 18.88 |========================================================
GCC 11.0.1 ... 18.31 |======================================================
AMD AOCC 3.0 . 19.13 |========================================================

SVT-VP9 0.3
Tuning: VMAF Optimized - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
Clang 12.0 ... 487.43 |========================================================
Clang 11.0 ... 481.05 |=======================================================
GCC 9.3 ...... 463.12 |=====================================================
GCC 10.3 ..... 472.61 |======================================================
GCC 11.0.1 ... 472.32 |======================================================
AMD AOCC 3.0 . 476.95 |=======================================================

SVT-VP9 0.3
Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
Clang 12.0 ... 488.23 |========================================================
Clang 11.0 ... 482.02 |=======================================================
GCC 9.3 ...... 464.57 |=====================================================
GCC 10.3 ..... 477.67 |=======================================================
GCC 11.0.1 ... 478.16 |=======================================================
AMD AOCC 3.0 . 478.62 |=======================================================

PostgreSQL pgbench 13.0
Scaling Factor: 100 - Clients: 1 - Mode: Read Only - Average Latency
ms < Lower Is Better
Clang 12.0 . 0.041 |==========================================================
Clang 11.0 .
0.040 |========================================================
GCC 9.3 .... 0.042 |===========================================================
GCC 10.3 ... 0.040 |========================================================
GCC 11.0.1 . 0.042 |===========================================================

AOM AV1 3.0
Encoder Mode: Speed 0 Two-Pass - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
Clang 12.0 . 0.21 |============================================================
Clang 11.0 . 0.21 |============================================================
GCC 9.3 .... 0.20 |=========================================================
GCC 10.3 ... 0.21 |============================================================
GCC 11.0.1 . 0.21 |============================================================

Botan 2.17.3
Test: KASUMI - Decrypt
MiB/s > Higher Is Better
Clang 12.0 ... 84.23 |=========================================================
Clang 11.0 ... 80.22 |======================================================
GCC 9.3 ...... 84.13 |=========================================================
GCC 10.3 ..... 81.45 |=======================================================
AMD AOCC 3.0 . 82.95 |========================================================

WebP Image Encode 1.1
Encode Settings: Default
Encode Time - Seconds < Lower Is Better
Clang 12.0 ... 1.331 |======================================================
Clang 11.0 ... 1.336 |=======================================================
GCC 9.3 ...... 1.397 |=========================================================
GCC 10.3 ..... 1.372 |========================================================
GCC 11.0.1 ... 1.386 |=========================================================
AMD AOCC 3.0 . 1.351 |=======================================================

SciMark 2.0
Computational Test: Dense LU Matrix Factorization
Mflops > Higher Is Better
Clang 12.0 ... 8848.40 |=====================================================
Clang 11.0 ...
9146.88 |======================================================
GCC 9.3 ...... 9178.97 |======================================================
GCC 10.3 ..... 9248.89 |=======================================================
GCC 11.0.1 ... 9263.55 |=======================================================
AMD AOCC 3.0 . 9021.83 |======================================================

dav1d 0.8.2
Video Input: Chimera 1080p
FPS > Higher Is Better
Clang 12.0 ... 1198.22 |=======================================================
Clang 11.0 ... 1190.41 |=======================================================
GCC 9.3 ...... 1145.50 |=====================================================
GCC 10.3 ..... 1171.04 |======================================================
GCC 11.0.1 ... 1180.44 |======================================================
AMD AOCC 3.0 . 1188.43 |=======================================================

Botan 2.17.3
Test: CAST-256 - Decrypt
MiB/s > Higher Is Better
Clang 12.0 ... 133.05 |========================================================
Clang 11.0 ... 127.74 |======================================================
GCC 9.3 ...... 127.34 |======================================================
GCC 10.3 ..... 127.78 |======================================================
AMD AOCC 3.0 . 128.01 |======================================================

Botan 2.17.3
Test: CAST-256
MiB/s > Higher Is Better
Clang 12.0 ... 132.82 |========================================================
Clang 11.0 ... 128.59 |======================================================
GCC 9.3 ...... 127.30 |======================================================
GCC 10.3 ..... 127.74 |======================================================
AMD AOCC 3.0 . 127.77 |======================================================

SciMark 2.0
Computational Test: Composite
Mflops > Higher Is Better
Clang 12.0 ... 3190.62 |=====================================================
Clang 11.0 ...
3319.34 |=======================================================
GCC 9.3 ...... 3229.22 |======================================================
GCC 10.3 ..... 3235.94 |======================================================
GCC 11.0.1 ... 3182.35 |=====================================================
AMD AOCC 3.0 . 3298.29 |=======================================================

Gcrypt Library 1.9
Seconds < Lower Is Better
Clang 12.0 ... 236.92 |=======================================================
Clang 11.0 ... 240.21 |========================================================
GCC 9.3 ...... 232.57 |======================================================
GCC 10.3 ..... 231.24 |======================================================
GCC 11.0.1 ... 233.51 |======================================================
AMD AOCC 3.0 . 240.41 |========================================================

FFTW 3.3.6
Build: Stock - Size: 2D FFT Size 4096
Mflops > Higher Is Better
Clang 12.0 ... 6744.1 |======================================================
Clang 11.0 ... 6823.8 |=======================================================
GCC 9.3 ...... 7007.3 |========================================================
GCC 10.3 ..... 6974.0 |========================================================
GCC 11.0.1 ... 6948.2 |========================================================
AMD AOCC 3.0 . 6875.3 |=======================================================

ASTC Encoder 2.4
Preset: Exhaustive
Seconds < Lower Is Better
Clang 12.0 ... 18.99 |=======================================================
Clang 11.0 ... 19.03 |=======================================================
GCC 9.3 ...... 19.48 |=========================================================
GCC 10.3 ..... 19.46 |=========================================================
GCC 11.0.1 ... 19.62 |=========================================================
AMD AOCC 3.0 .
18.91 |=======================================================

WebP Image Encode 1.1
Encode Settings: Quality 100, Lossless, Highest Compression
Encode Time - Seconds < Lower Is Better
Clang 12.0 ... 38.45 |========================================================
Clang 11.0 ... 37.73 |=======================================================
GCC 9.3 ...... 39.07 |=========================================================
GCC 10.3 ..... 38.55 |========================================================
GCC 11.0.1 ... 37.95 |=======================================================
AMD AOCC 3.0 . 38.34 |========================================================

AOM AV1 3.0
Encoder Mode: Speed 4 Two-Pass - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
Clang 12.0 . 4.87 |===========================================================
Clang 11.0 . 4.95 |============================================================
GCC 9.3 .... 4.78 |==========================================================
GCC 10.3 ... 4.84 |===========================================================
GCC 11.0.1 . 4.84 |===========================================================

WebP Image Encode 1.1
Encode Settings: Quality 100
Encode Time - Seconds < Lower Is Better
Clang 12.0 ... 2.199 |=======================================================
Clang 11.0 ... 2.240 |========================================================
GCC 9.3 ...... 2.273 |=========================================================
GCC 10.3 ..... 2.225 |========================================================
GCC 11.0.1 ... 2.274 |=========================================================
AMD AOCC 3.0 . 2.262 |=========================================================

FFTW 3.3.6
Build: Float + SSE - Size: 2D FFT Size 2048
Mflops > Higher Is Better
Clang 12.0 ... 31935 |=========================================================
Clang 11.0 ... 31741 |========================================================
GCC 9.3 ......
31341 |========================================================
GCC 10.3 ..... 32061 |=========================================================
GCC 11.0.1 ... 31662 |========================================================
AMD AOCC 3.0 . 31013 |=======================================================

simdjson 0.8.2
Throughput Test: Kostya
GB/s > Higher Is Better
Clang 12.0 ... 2.75 |==========================================================
Clang 11.0 ... 2.68 |========================================================
GCC 9.3 ...... 2.75 |==========================================================
GCC 10.3 ..... 2.77 |==========================================================
AMD AOCC 3.0 . 2.73 |=========================================================

AOM AV1 3.0
Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
Clang 12.0 . 22.13 |===========================================================
Clang 11.0 . 22.00 |===========================================================
GCC 9.3 .... 21.42 |=========================================================
GCC 10.3 ... 21.64 |==========================================================
GCC 11.0.1 . 22.11 |===========================================================

JPEG XL 0.3.3
Input: JPEG - Encode Speed: 8
MP/s > Higher Is Better
Clang 12.0 ... 28.13 |=========================================================
Clang 11.0 ... 27.24 |=======================================================
AMD AOCC 3.0 . 27.29 |=======================================================

PostgreSQL pgbench 13.0
Scaling Factor: 100 - Clients: 100 - Mode: Read Only - Average Latency
ms < Lower Is Better
Clang 12.0 . 0.094 |==========================================================
Clang 11.0 . 0.094 |==========================================================
GCC 9.3 .... 0.095 |===========================================================
GCC 10.3 ...
0.093 |==========================================================
GCC 11.0.1 . 0.092 |=========================================================

PostgreSQL pgbench 13.0
Scaling Factor: 100 - Clients: 100 - Mode: Read Only
TPS > Higher Is Better
Clang 12.0 . 1069022 |========================================================
Clang 11.0 . 1069367 |========================================================
GCC 9.3 .... 1057125 |=======================================================
GCC 10.3 ... 1076357 |========================================================
GCC 11.0.1 . 1090824 |=========================================================

PostgreSQL pgbench 13.0
Scaling Factor: 100 - Clients: 1 - Mode: Read Write
TPS > Higher Is Better
Clang 12.0 . 3281 |==========================================================
Clang 11.0 . 3312 |===========================================================
GCC 9.3 .... 3298 |==========================================================
GCC 10.3 ... 3369 |============================================================
GCC 11.0.1 . 3383 |============================================================

x265 3.4
Video Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
Clang 12.0 ... 74.00 |=========================================================
Clang 11.0 ... 73.36 |=========================================================
GCC 9.3 ...... 72.14 |========================================================
GCC 10.3 ..... 72.60 |========================================================
GCC 11.0.1 ... 71.79 |=======================================================
AMD AOCC 3.0 . 73.51 |=========================================================

PostgreSQL pgbench 13.0
Scaling Factor: 100 - Clients: 1 - Mode: Read Write - Average Latency
ms < Lower Is Better
Clang 12.0 . 0.305 |===========================================================
Clang 11.0 . 0.302 |==========================================================
GCC 9.3 ....
0.303 |===========================================================
GCC 10.3 ... 0.297 |=========================================================
GCC 11.0.1 . 0.296 |=========================================================

LZ4 Compression 1.9.3
Compression Level: 9 - Decompression Speed
MB/s > Higher Is Better
Clang 12.0 ..... 13926.5 |=====================================================
Clang 11.0 ..... 13927.9 |=====================================================
Clang 12.0 LTO . 13698.7 |====================================================
GCC 9.3 ........ 13895.3 |=====================================================
GCC 10.3 ....... 13806.6 |=====================================================
GCC 11.0.1 ..... 13857.4 |=====================================================
AMD AOCC 3.0 ... 13561.5 |====================================================

LZ4 Compression 1.9.3
Compression Level: 3 - Decompression Speed
MB/s > Higher Is Better
Clang 12.0 ..... 13911.5 |=====================================================
Clang 11.0 ..... 13840.3 |=====================================================
Clang 12.0 LTO . 13715.0 |====================================================
GCC 9.3 ........ 13793.4 |=====================================================
GCC 10.3 ....... 13906.1 |=====================================================
GCC 11.0.1 ..... 13882.2 |=====================================================
AMD AOCC 3.0 ... 13562.5 |====================================================

Opus Codec Encoding 1.3.1
WAV To Opus Encode
Seconds < Lower Is Better
Clang 12.0 . 7.567 |===========================================================
Clang 11.0 . 7.392 |==========================================================
GCC 9.3 .... 7.504 |===========================================================
GCC 10.3 ... 7.469 |==========================================================
GCC 11.0.1 .
7.381 |==========================================================

JPEG XL 0.3.3
Input: PNG - Encode Speed: 8
MP/s > Higher Is Better
Clang 12.0 ... 0.82 |==========================================================
Clang 11.0 ... 0.80 |=========================================================
AMD AOCC 3.0 . 0.81 |=========================================================

dav1d 0.8.2
Video Input: Summer Nature 4K
FPS > Higher Is Better
Clang 12.0 ... 541.56 |========================================================
Clang 11.0 ... 543.43 |========================================================
GCC 9.3 ...... 530.82 |=======================================================
GCC 10.3 ..... 536.71 |=======================================================
GCC 11.0.1 ... 538.28 |=======================================================
AMD AOCC 3.0 . 541.58 |========================================================

PostgreSQL pgbench 13.0
Scaling Factor: 100 - Clients: 250 - Mode: Read Only
TPS > Higher Is Better
Clang 12.0 . 1071209 |========================================================
Clang 11.0 . 1065506 |========================================================
GCC 9.3 .... 1067486 |========================================================
GCC 10.3 ... 1089731 |=========================================================
GCC 11.0.1 . 1090160 |=========================================================

PostgreSQL pgbench 13.0
Scaling Factor: 100 - Clients: 250 - Mode: Read Only - Average Latency
ms < Lower Is Better
Clang 12.0 . 0.234 |===========================================================
Clang 11.0 . 0.235 |===========================================================
GCC 9.3 .... 0.235 |===========================================================
GCC 10.3 ... 0.230 |==========================================================
GCC 11.0.1 .
0.230 |==========================================================

dav1d 0.8.2
Video Input: Summer Nature 1080p
FPS > Higher Is Better
Clang 12.0 ... 1244.11 |=======================================================
Clang 11.0 ... 1251.25 |=======================================================
GCC 9.3 ...... 1228.63 |======================================================
GCC 10.3 ..... 1245.11 |=======================================================
GCC 11.0.1 ... 1249.74 |=======================================================
AMD AOCC 3.0 . 1251.91 |=======================================================

oneDNN 2.1.2
Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU
ms < Lower Is Better
Clang 12.0 ... 0.779776 |======================================================
Clang 11.0 ... 0.779101 |=====================================================
GCC 9.3 ...... 0.786762 |======================================================
GCC 10.3 ..... 0.782476 |======================================================
AMD AOCC 3.0 . 0.773233 |=====================================================

FFTW 3.3.6
Build: Float + SSE - Size: 2D FFT Size 1024
Mflops > Higher Is Better
Clang 12.0 ... 36239 |=========================================================
Clang 11.0 ... 36181 |=========================================================
GCC 9.3 ...... 36321 |=========================================================
GCC 10.3 ..... 35973 |========================================================
GCC 11.0.1 ... 35718 |========================================================
AMD AOCC 3.0 . 36100 |=========================================================

JPEG XL 0.3.3
Input: JPEG - Encode Speed: 5
MP/s > Higher Is Better
Clang 12.0 ... 66.66 |=========================================================
Clang 11.0 ... 65.58 |========================================================
AMD AOCC 3.0 .
65.57 |========================================================

JPEG XL 0.3.3
Input: JPEG - Encode Speed: 7
MP/s > Higher Is Better
Clang 12.0 ... 66.38 |=========================================================
Clang 11.0 ... 65.43 |========================================================
AMD AOCC 3.0 . 65.68 |========================================================

ONNX Runtime 1.6
Model: super-resolution-10 - Device: OpenMP CPU
Inferences Per Minute > Higher Is Better
Clang 12.0 ... 4456 |==============================================
Clang 11.0 ... 4523 |===============================================
GCC 9.3 ...... 5183 |======================================================
GCC 10.3 ..... 5559 |==========================================================
AMD AOCC 3.0 . 4383 |==============================================

ONNX Runtime 1.6
Model: bertsquad-10 - Device: OpenMP CPU
Inferences Per Minute > Higher Is Better
Clang 12.0 ... 498 |==========================================================
Clang 11.0 ... 471 |=======================================================
GCC 9.3 ...... 495 |==========================================================
GCC 10.3 ..... 505 |===========================================================
AMD AOCC 3.0 . 459 |======================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-N
GB/s > Higher Is Better
Clang 12.0 ... 69.1 |==========================================================
Clang 11.0 ... 51.2 |===========================================
GCC 9.3 ...... 65.0 |=======================================================
GCC 10.3 ..... 56.2 |===============================================
GCC 11.0.1 ... 63.9 |======================================================
AMD AOCC 3.0 . 55.2 |==============================================

ViennaCL 1.7.1
Test: CPU BLAS - sDOT
GB/s > Higher Is Better
Clang 12.0 ... 434.00 |=====================================
Clang 11.0 ...
462.00 |========================================
GCC 9.3 ...... 636.00 |=======================================================
GCC 10.3 ..... 592.97 |===================================================
GCC 11.0.1 ... 649.00 |========================================================
AMD AOCC 3.0 . 477.00 |=========================================

ViennaCL 1.7.1
Test: CPU BLAS - sAXPY
GB/s > Higher Is Better
Clang 12.0 ... 357.0 |=============
Clang 11.0 ... 412.0 |===============
GCC 9.3 ...... 813.0 |==============================
GCC 10.3 ..... 1350.0 |===================================================
GCC 11.0.1 ... 1496.0 |========================================================
AMD AOCC 3.0 . 326.0 |============

ViennaCL 1.7.1
Test: CPU BLAS - sCOPY
GB/s > Higher Is Better
Clang 12.0 ... 471.00 |=====================
Clang 11.0 ... 495.00 |======================
GCC 9.3 ...... 1217.00 |=======================================================
GCC 10.3 ..... 1065.60 |================================================
GCC 11.0.1 ... 1210.00 |=======================================================
AMD AOCC 3.0 . 531.00 |========================