AMD Kaveri Compiler Tests

Quick tests on an AMD A10-7850K Kaveri APU running Linux, comparing GCC against several LLVM Clang releases obtained from the Ubuntu Linux archive and the LLVM.org APT repository for LLVM/Clang. Compiler AMD Linux tests by Michael Larabel.

System (identical for all five configurations; only the compiler differs):

  Processor: AMD A10-7850K APU with Radeon R7 @ 3.70GHz (4 Cores), Motherboard: ASUS A88X-PRO, Chipset: AMD Device 1422, Memory: 7168MB, Disk: 240GB OCZ VERTEX3, Graphics: ASUS AMD Radeon R7 1024MB (960/1066MHz), Audio: ATI R6xx HDMI, Monitor: VA2431, Network: Realtek RTL8111/8168/8411

  OS: Ubuntu 13.10, Kernel: 3.12.0-031200-generic (x86_64), Desktop: Unity 7.1.2, Display Server: X Server 1.14.5, Display Driver: fglrx 13.30.1, OpenGL: 4.3.12682, File-System: ext4, Screen Resolution: 1920x1080

Compiler per configuration:

  GCC 4.8.2:          Compiler: GCC 4.8
  LLVM Clang 3.2:     Compiler: Clang 3.2-7ubuntu1
  LLVM Clang 3.3:     Compiler: Clang 3.3-5ubuntu4
  LLVM Clang 3.4:     Compiler: Clang 3.4-1~gd~s
  LLVM Clang 3.5 SVN: Compiler: Clang 3.5-1~exp1


John The Ripper 1.8.0
Test: Blowfish
Real C/S > Higher Is Better
GCC 4.8.2 .......... 3172 |====================================================
LLVM Clang 3.2 ..... 777  |=============
LLVM Clang 3.3 ..... 779  |=============
LLVM Clang 3.4 ..... 809  |=============
LLVM Clang 3.5 SVN . 806  |=============

Smallpt 1.0
Global Illumination Renderer; 100 Samples
Seconds < Lower Is Better
GCC 4.8.2 .......... 83  |==============
LLVM Clang 3.2 ..... 304 |===================================================
LLVM Clang 3.3 ..... 307 |===================================================
LLVM Clang 3.4 ..... 314 |====================================================
LLVM Clang 3.5 SVN . 317 |=====================================================

Rodinia 2.4
Test: OpenMP Streamcluster
Seconds < Lower Is Better
GCC 4.8.2 .......... 69.60  |===============
LLVM Clang 3.4 ..... 224.98 |==================================================
LLVM Clang 3.5 SVN . 225.51 |==================================================

Rodinia 2.4
Test: OpenMP CFD Solver
Seconds < Lower Is Better
GCC 4.8.2 .......... 181.99 |=======================
LLVM Clang 3.4 ..... 403.36 |==================================================
LLVM Clang 3.5 SVN . 400.26 |==================================================

C-Ray 1.1
Total Time
Seconds < Lower Is Better
GCC 4.8.2 .......... 47.94 |=============================
LLVM Clang 3.2 ..... 84.93 |===================================================
LLVM Clang 3.3 ..... 84.82 |===================================================
LLVM Clang 3.4 ..... 83.23 |==================================================
LLVM Clang 3.5 SVN . 83.25 |==================================================

Timed PHP Compilation 5.2.9
Time To Compile
Seconds < Lower Is Better
GCC 4.8.2 .......... 62.92 |===================================================
LLVM Clang 3.2 ..... 35.91 |=============================
LLVM Clang 3.3 ..... 35.85 |=============================
LLVM Clang 3.4 ..... 36.53 |==============================
LLVM Clang 3.5 SVN . 37.21 |==============================

Timed Apache Compilation 2.4.7
Time To Compile
Seconds < Lower Is Better
GCC 4.8.2 .......... 62.51 |===================================================
LLVM Clang 3.2 ..... 39.77 |================================
LLVM Clang 3.3 ..... 39.07 |================================
LLVM Clang 3.4 ..... 41.42 |==================================
LLVM Clang 3.5 SVN . 41.65 |==================================

SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops > Higher Is Better
GCC 4.8.2 .......... 619.37 |===============================
LLVM Clang 3.2 ..... 989.00 |==================================================
LLVM Clang 3.3 ..... 976.56 |=================================================
LLVM Clang 3.4 ..... 987.41 |==================================================
LLVM Clang 3.5 SVN . 976.41 |=================================================

Hierarchical INTegration 1.0
Test: FLOAT
QUIPs > Higher Is Better
GCC 4.8.2 .......... 167321466.12 |============================================
LLVM Clang 3.2 ..... 126855139.19 |=================================
LLVM Clang 3.3 ..... 116609505.33 |===============================
LLVM Clang 3.4 ..... 131700715.89 |===================================
LLVM Clang 3.5 SVN . 106584321.58 |============================

LAME MP3 Encoding 3.99.3
WAV To MP3
Seconds < Lower Is Better
GCC 4.8.2 .......... 22.05 |===========================================
LLVM Clang 3.2 ..... 22.74 |=============================================
LLVM Clang 3.3 ..... 22.37 |============================================
LLVM Clang 3.4 ..... 20.18 |========================================
LLVM Clang 3.5 SVN . 25.96 |===================================================

Timed MrBayes Analysis 3.1.2
Primate Phylogeny Analysis
Seconds < Lower Is Better
GCC 4.8.2 .......... 22.66 |=========================================
LLVM Clang 3.2 ..... 27.88 |==================================================
LLVM Clang 3.3 ..... 28.49 |===================================================
LLVM Clang 3.4 ..... 28.10 |==================================================
LLVM Clang 3.5 SVN . 28.17 |==================================================

Timed HMMer Search 2.3.2
Pfam Database Search
Seconds < Lower Is Better
GCC 4.8.2 .......... 22.64 |==========================================
LLVM Clang 3.2 ..... 27.69 |===================================================
LLVM Clang 3.3 ..... 22.83 |==========================================
LLVM Clang 3.4 ..... 22.59 |==========================================
LLVM Clang 3.5 SVN . 22.76 |==========================================

TSCP 1.81
AI Chess Performance
Nodes Per Second > Higher Is Better
GCC 4.8.2 .......... 697373 |==================================================
LLVM Clang 3.2 ..... 602032 |===========================================
LLVM Clang 3.3 ..... 601419 |===========================================
LLVM Clang 3.4 ..... 573201 |=========================================
LLVM Clang 3.5 SVN . 581799 |==========================================

FLAC Audio Encoding 1.3.0
WAV To FLAC
Seconds < Lower Is Better
GCC 4.8.2 .......... 9.11  |=============================================
LLVM Clang 3.2 ..... 10.38 |===================================================
LLVM Clang 3.3 ..... 9.87  |================================================
LLVM Clang 3.4 ..... 8.72  |===========================================
LLVM Clang 3.5 SVN . 8.58  |==========================================

SciMark 2.0
Computational Test: Composite
Mflops > Higher Is Better
GCC 4.8.2 .......... 551.12 |===========================================
LLVM Clang 3.2 ..... 621.50 |================================================
LLVM Clang 3.3 ..... 633.06 |=================================================
LLVM Clang 3.4 ..... 644.57 |==================================================
LLVM Clang 3.5 SVN . 639.48 |==================================================

Stream 2013-01-17
Type: Copy
MB/s > Higher Is Better
GCC 4.8.2 .......... 10514.31 |================================================
LLVM Clang 3.2 ..... 10144.09 |==============================================
LLVM Clang 3.3 ..... 9438.23  |===========================================
LLVM Clang 3.4 ..... 9398.01  |===========================================
LLVM Clang 3.5 SVN . 9942.82  |=============================================

SciMark 2.0
Computational Test: Dense LU Matrix Factorization
Mflops > Higher Is Better
GCC 4.8.2 .......... 955.80  |=============================================
LLVM Clang 3.2 ..... 1001.57 |===============================================
LLVM Clang 3.3 ..... 998.79  |===============================================
LLVM Clang 3.4 ..... 1046.78 |=================================================
LLVM Clang 3.5 SVN . 1035.60 |================================================

Minion 0.15
Benchmark: Solitaire
Seconds < Lower Is Better
GCC 4.8.2 ...... 148.46 |=================================================
LLVM Clang 3.2 . 162.35 |======================================================
LLVM Clang 3.3 . 153.27 |===================================================

Stream 2013-01-17
Type: Triad
MB/s > Higher Is Better
GCC 4.8.2 .......... 6964.29 |==============================================
LLVM Clang 3.2 ..... 7430.89 |=================================================
LLVM Clang 3.3 ..... 6850.19 |=============================================
LLVM Clang 3.4 ..... 6856.47 |=============================================
LLVM Clang 3.5 SVN . 7258.64 |================================================

Stream 2013-01-17
Type: Add
MB/s > Higher Is Better
GCC 4.8.2 .......... 6950.99 |==============================================
LLVM Clang 3.2 ..... 7417.64 |=================================================
LLVM Clang 3.3 ..... 6896.16 |==============================================
LLVM Clang 3.4 ..... 6867.13 |=============================================
LLVM Clang 3.5 SVN . 7193.37 |================================================

Minion 0.15
Benchmark: Quasigroup
Seconds < Lower Is Better
GCC 4.8.2 ...... 197.36 |===================================================
LLVM Clang 3.2 . 210.30 |======================================================
LLVM Clang 3.3 . 209.50 |======================================================

Stream 2013-01-17
Type: Scale
MB/s > Higher Is Better
GCC 4.8.2 .......... 6545.78 |================================================
LLVM Clang 3.2 ..... 6322.46 |===============================================
LLVM Clang 3.3 ..... 6313.74 |===============================================
LLVM Clang 3.4 ..... 6342.96 |===============================================
LLVM Clang 3.5 SVN . 6615.34 |=================================================

Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS > Higher Is Better
GCC 4.8.2 .......... 814.79 |==================================================
LLVM Clang 3.2 ..... 778.87 |================================================
LLVM Clang 3.3 ..... 789.57 |================================================
LLVM Clang 3.4 ..... 796.14 |=================================================
LLVM Clang 3.5 SVN . 784.69 |================================================

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops > Higher Is Better
GCC 4.8.2 .......... 709.27 |================================================
LLVM Clang 3.2 ..... 730.13 |==================================================
LLVM Clang 3.3 ..... 727.54 |==================================================
LLVM Clang 3.4 ..... 730.16 |==================================================
LLVM Clang 3.5 SVN . 733.40 |==================================================

OpenSSL 1.0.1f
RSA 4096-bit Performance
Signs Per Second > Higher Is Better
GCC 4.8.2 .......... 272.63 |=================================================
LLVM Clang 3.2 ..... 275.70 |==================================================
LLVM Clang 3.3 ..... 276.20 |==================================================
LLVM Clang 3.4 ..... 277.97 |==================================================
LLVM Clang 3.5 SVN . 277.87 |==================================================

Minion 0.15
Benchmark: Graceful
Seconds < Lower Is Better
GCC 4.8.2 ...... 91.28 |=======================================================
LLVM Clang 3.2 . 90.57 |=======================================================
LLVM Clang 3.3 . 89.67 |======================================================

NGINX Benchmark 1.0.11
Static Web Page Serving
Requests Per Second > Higher Is Better
GCC 4.8.2 .......... 23202.65 |================================================
LLVM Clang 3.2 ..... 22909.44 |===============================================
LLVM Clang 3.3 ..... 22865.18 |===============================================
LLVM Clang 3.4 ..... 22858.95 |===============================================
LLVM Clang 3.5 SVN . 22886.06 |===============================================

FFmpeg 2.1.1
H.264 HD To NTSC DV
Seconds < Lower Is Better
GCC 4.8.2 .......... 22.18 |===================================================
LLVM Clang 3.2 ..... 21.89 |==================================================
LLVM Clang 3.3 ..... 22.04 |===================================================
LLVM Clang 3.4 ..... 22.04 |===================================================
LLVM Clang 3.5 SVN . 22.17 |===================================================

SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops > Higher Is Better
GCC 4.8.2 .......... 96.36 |==================================================
LLVM Clang 3.2 ..... 97.23 |===================================================
LLVM Clang 3.3 ..... 97.38 |===================================================
LLVM Clang 3.4 ..... 96.36 |==================================================
LLVM Clang 3.5 SVN . 96.14 |==================================================

x264 2014-01-09
H.264 Video Encoding
Frames Per Second > Higher Is Better
GCC 4.8.2 .......... 77.08 |===================================================
LLVM Clang 3.3 ..... 77.14 |===================================================
LLVM Clang 3.4 ..... 77.50 |===================================================
LLVM Clang 3.5 SVN . 76.57 |==================================================

Apache Benchmark 2.4.7
Static Web Page Serving
Requests Per Second > Higher Is Better
GCC 4.8.2 .......... 18626.27 |================================================
LLVM Clang 3.2 ..... 18582.56 |================================================
LLVM Clang 3.3 ..... 18670.42 |================================================
LLVM Clang 3.4 ..... 18530.06 |================================================
LLVM Clang 3.5 SVN . 18519.78 |================================================

POV-Ray 3.7.0
Total Time
Seconds < Lower Is Better
GCC 4.8.2 . 305.66 |===========================================================

TTSIOD 3D Renderer 2.2z
Phong Rendering With Soft-Shadow Mapping
FPS > Higher Is Better
GCC 4.8.2 . 60.20 |============================================================

Botan 1.11.6
Test: X9.19-MAC
Mbytes/s > Higher Is Better
GCC 4.8.2 . 57.46 |============================================================

Botan 1.11.6
Test: CAST-256
Mbytes/s > Higher Is Better
GCC 4.8.2 . 71.72 |============================================================

Botan 1.11.6
Test: Twofish
Mbytes/s > Higher Is Better
GCC 4.8.2 . 153.02 |===========================================================

Botan 1.11.6
Test: AES-256
Mbytes/s > Higher Is Better
GCC 4.8.2 . 3484.21 |==========================================================

Botan 1.11.6
Test: KASUMI
Mbytes/s > Higher Is Better
GCC 4.8.2 . 57.60 |============================================================

Botan 1.11.6
Test: Tiger
Mbytes/s > Higher Is Better
GCC 4.8.2 . 330.03 |===========================================================

Parboil 2.5
Test: OpenMP Stencil
Seconds < Lower Is Better
GCC 4.8.2 . 57.17 |============================================================

Parboil 2.5
Test: OpenMP CUTCP
Seconds < Lower Is Better
GCC 4.8.2 . 39.12 |============================================================

Parboil 2.5
Test: OpenMP LBM
Seconds < Lower Is Better
GCC 4.8.2 . 492.16 |===========================================================

SciMark 2.0
Computational Test: Monte Carlo
Mflops > Higher Is Better
GCC 4.8.2 .......... 374.78 |==================================================
LLVM Clang 3.2 ..... 289.57 |=======================================
LLVM Clang 3.3 ..... 365.03 |=================================================
LLVM Clang 3.4 ..... 362.15 |================================================
LLVM Clang 3.5 SVN . 355.84 |===============================================

BLAKE2 20130131
Phoronix Test Suite v5.0.0m0
Cycles Per Byte < Lower Is Better
GCC 4.8.2 .......... 11.11 |===================================================
LLVM Clang 3.2 ..... 11.15 |===================================================
LLVM Clang 3.3 ..... 10.04 |==============================================
LLVM Clang 3.4 ..... 8.87  |=========================================
LLVM Clang 3.5 SVN . 9.92  |=============================================
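
The gaps in the results above are easier to compare as ratios than as bar lengths. The following is a minimal sketch, not part of the Phoronix Test Suite: the helper name `relative_to_gcc` and the `RESULTS` table are hypothetical, and the numbers are a hand-copied subset of the GCC 4.8.2 and LLVM Clang 3.4 figures listed above. It normalizes both "higher is better" and "lower is better" metrics so that a value above 1 always means the Clang 3.4 result was ahead of GCC 4.8.2.

```python
# Hand-copied subset of the results above:
# test name -> (GCC 4.8.2 value, LLVM Clang 3.4 value, higher_is_better)
RESULTS = {
    "John The Ripper Blowfish (Real C/S)": (3172.0, 809.0, True),
    "Smallpt (Seconds)": (83.0, 314.0, False),
    "Timed PHP Compilation (Seconds)": (62.92, 36.53, False),
    "SciMark Jacobi SOR (Mflops)": (619.37, 987.41, True),
}


def relative_to_gcc(gcc, clang, higher_is_better):
    """Return the Clang result as a multiple of the GCC result.

    Hypothetical helper for illustration: a value above 1 means the Clang
    result was better, a value below 1 means the GCC result was better.
    """
    # Throughput metrics: larger wins, so divide Clang by GCC.
    # Timing metrics: smaller wins, so the ratio is inverted.
    return clang / gcc if higher_is_better else gcc / clang


if __name__ == "__main__":
    for name, (gcc, clang, higher_is_better) in RESULTS.items():
        ratio = relative_to_gcc(gcc, clang, higher_is_better)
        print(f"{name}: Clang 3.4 at {ratio:.2f}x GCC 4.8.2")
```

Note that the timed compilation tests measure how quickly the compiler itself builds a project, while the other entries measure the speed of the code it generated, so the two kinds of ratio should not be averaged together.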