GCC 4.9 Compiler Optimization Tuning AMD Kaveri

AMD Steamroller CPU cores on the AMD A10-7850K Kaveri APU: compiler optimization tuning with various march= values. Benchmarks by Michael Larabel for a future article on Phoronix.com.

k8, barcelona, bdver1, bdver2, bdver3 (the reported system details are identical for all five configurations):

  Processor: AMD A10-7850K APU with Radeon R7 @ 3.70GHz (4 Cores), Motherboard: Gigabyte F2A88XM-D3H, Chipset: AMD Device 1422, Memory: 7168MB, Disk: 120GB KINGSTON SV300S3, Graphics: AMD Kaveri 1024MB, Audio: ATI R6xx HDMI, Monitor: TSB-TV, Network: Realtek RTL8111/8168/8411

  OS: Ubuntu 14.04, Kernel: 3.13.0-5-generic (x86_64), Desktop: Unity 7.1.2, Display Driver: radeon 7.2.99, Compiler: GCC 4.9.0 20140126 + Clang 3.4 + LLVM 3.4, File-System: ext4, Screen Resolution: 1920x1080

SciMark 2.0
Computational Test: Composite
Mflops > Higher Is Better
k8 ........ 636.54
barcelona . 629.44
bdver1 .... 640.56
bdver2 .... 644.89
bdver3 .... 641.05

SciMark 2.0
Computational Test: Monte Carlo
Mflops > Higher Is Better
k8 ........ 397.86
barcelona . 384.47
bdver1 .... 423.73
bdver2 .... 423.28
bdver3 .... 413.64

SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops > Higher Is Better
k8 ........ 73.59
barcelona . 68.99
bdver1 .... 68.50
bdver2 .... 71.15
bdver3 .... 70.77

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops > Higher Is Better
k8 ........ 865.92
barcelona . 849.34
bdver1 .... 860.41
bdver2 .... 877.91
bdver3 .... 866.81

SciMark 2.0
Computational Test: Dense LU Matrix Factorization
Mflops > Higher Is Better
k8 ........ 1156.29
barcelona . 1155.51
bdver1 .... 1162.41
bdver2 .... 1164.64
bdver3 .... 1165.94

SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops > Higher Is Better
k8 ........ 689.02
barcelona . 688.88
bdver1 .... 687.76
bdver2 .... 687.47
bdver3 .... 688.08

TSCP 1.81
AI Chess Performance
Nodes Per Second > Higher Is Better
k8 ........ 760113
barcelona . 742690
bdver1 .... 738311
bdver2 .... 738707
bdver3 .... 739101

x264 2014-01-09
H.264 Video Encoding
Frames Per Second > Higher Is Better
k8 ........ 83.43
barcelona . 83.66
bdver1 .... 84.14
bdver2 .... 83.85
bdver3 .... 83.83

GraphicsMagick 1.3.19
Operation: Blur
Iterations Per Minute > Higher Is Better
k8 ........ 97
barcelona . 93
bdver1 .... 108
bdver2 .... 110
bdver3 .... 106

GraphicsMagick 1.3.19
Operation: Sharpen
Iterations Per Minute > Higher Is Better
k8 ........ 71
barcelona . 72
bdver1 .... 87
bdver2 .... 81
bdver3 .... 81

GraphicsMagick 1.3.19
Operation: Resizing
Iterations Per Minute > Higher Is Better
k8 ........ 106
barcelona . 120
bdver1 .... 133
bdver2 .... 133
bdver3 .... 133

GraphicsMagick 1.3.19
Operation: HWB Color Space
Iterations Per Minute > Higher Is Better
k8 ........ 126
barcelona . 138
bdver1 .... 139
bdver2 .... 139
bdver3 .... 139

GraphicsMagick 1.3.19
Operation: Local Adaptive Thresholding
Iterations Per Minute > Higher Is Better
k8 ........ 77
barcelona . 76
bdver1 .... 81
bdver2 .... 81
bdver3 .... 80

Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS > Higher Is Better
k8 ........ 867.36
barcelona . 898.33
bdver1 .... 894.27
bdver2 .... 905.30
bdver3 .... 902.98

Timed Apache Compilation 2.4.7
Time To Compile
Seconds < Lower Is Better
k8 ........ 58.55
barcelona . 58.68
bdver1 .... 59.07
bdver2 .... 58.91
bdver3 .... 58.83

Timed PHP Compilation 5.2.9
Time To Compile
Seconds < Lower Is Better
k8 ........ 56.54
barcelona . 56.60
bdver1 .... 58.45
bdver2 .... 58.54
bdver3 .... 58.48

C-Ray 1.1
Total Time
Seconds < Lower Is Better
k8 ........ 87.90
barcelona . 53.33
bdver1 .... 40.67
bdver2 .... 40.55
bdver3 .... 40.54

FLAC Audio Encoding 1.3.0
WAV To FLAC
Seconds < Lower Is Better
k8 ........ 6.62
barcelona . 6.90
bdver1 .... 5.29
bdver2 .... 5.47
bdver3 .... 5.52
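
The results above mix "higher is better" and "lower is better" metrics, which makes the configurations awkward to compare at a glance. As a rough illustration, the Python sketch below normalizes a handful of the reported numbers to a speedup ratio relative to k8, so that a value above 1.0 always means "faster than the k8 build". The choice of k8 as the baseline, the subset of tests, and the names `RESULTS` and `speedup_vs_baseline` are assumptions made for this example only; they are not part of the original benchmark output.

```python
# Sketch only: normalizes a few results from the listing above against k8.
# The baseline choice and all names here are illustrative assumptions.

# test name -> (higher_is_better, {configuration: reported value})
RESULTS = {
    "SciMark 2.0 Composite (Mflops)": (
        True,
        {"k8": 636.54, "barcelona": 629.44, "bdver1": 640.56,
         "bdver2": 644.89, "bdver3": 641.05},
    ),
    "GraphicsMagick Resizing (iterations/min)": (
        True,
        {"k8": 106, "barcelona": 120, "bdver1": 133,
         "bdver2": 133, "bdver3": 133},
    ),
    "C-Ray 1.1 (seconds)": (
        False,
        {"k8": 87.90, "barcelona": 53.33, "bdver1": 40.67,
         "bdver2": 40.55, "bdver3": 40.54},
    ),
    "FLAC 1.3.0 WAV To FLAC (seconds)": (
        False,
        {"k8": 6.62, "barcelona": 6.90, "bdver1": 5.29,
         "bdver2": 5.47, "bdver3": 5.52},
    ),
}


def speedup_vs_baseline(values, higher_is_better, baseline="k8"):
    """Return {configuration: ratio}, where ratio > 1.0 means faster than baseline."""
    base = values[baseline]
    if higher_is_better:
        # Throughput-style metric: more is faster.
        return {name: value / base for name, value in values.items()}
    # Time-style metric: fewer seconds is faster, so invert the ratio.
    return {name: base / value for name, value in values.items()}


if __name__ == "__main__":
    for test, (higher_is_better, values) in RESULTS.items():
        print(test)
        ratios = speedup_vs_baseline(values, higher_is_better)
        for name, ratio in ratios.items():
            print(f"  {name:<10} {ratio:.2f}x")
```

Read this way, C-Ray stands out: the bdver builds finish in well under half the k8 time, while SciMark Composite stays within a couple of percent across all five configurations.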