GCC 4.9 Compiler Optimization Tuning AMD Kaveri
Compiler optimization tuning of the AMD Steamroller CPU cores on the AMD A10-7850K Kaveri APU with various -march= values. Benchmarks by Michael Larabel for a future article on Phoronix.com.
HTML result view exported from: https://openbenchmarking.org/result/1401282-PL-GCC49COMP74&gru&rdt .
Result identifiers: bdver3, bdver2, bdver1, barcelona, k8

Processor:          AMD A10-7850K APU with Radeon R7 @ 3.70GHz (4 Cores)
Motherboard:        Gigabyte F2A88XM-D3H
Chipset:            AMD Device 1422
Memory:             7168MB
Disk:               120GB KINGSTON SV300S3
Graphics:           AMD Kaveri 1024MB
Audio:              ATI R6xx HDMI
Monitor:            TSB-TV
Network:            Realtek RTL8111/8168/8411
OS:                 Ubuntu 14.04
Kernel:             3.13.0-5-generic (x86_64)
Desktop:            Unity 7.1.2
Display Driver:     radeon 7.2.99
Compiler:           GCC 4.9.0 20140126 + Clang 3.4 + LLVM 3.4
File-System:        ext4
Screen Resolution:  1920x1080

Kernel Details - radeon.dpm=1
Compiler Details - --disable-multilib --enable-checking=release --enable-languages=c,c++,fortran
Processor Details - Scaling Governor: acpi-cpufreq ondemand
Results overview

Test                                          bdver3    bdver2    bdver1    barcelona k8
x264: H.264 Video Encoding                    83.83     83.85     84.14     83.66     83.43
graphics-magick: Blur                         106       110       108       93        97
graphics-magick: Sharpen                      81        81        87        72        71
graphics-magick: Resizing                     133       133       133       120       106
graphics-magick: HWB Color Space              139       139       139       138       126
graphics-magick: Local Adaptive Thresholding  80        81        81        76        77
scimark2: Composite                           641.05    644.89    640.56    629.44    636.54
scimark2: Monte Carlo                         413.64    423.28    423.73    384.47    397.86
scimark2: Fast Fourier Transform              70.77     71.15     68.50     68.99     73.59
scimark2: Sparse Matrix Multiply              866.81    877.91    860.41    849.34    865.92
scimark2: Dense LU Matrix Factorization       1165.94   1164.64   1162.41   1155.51   1156.29
scimark2: Jacobi Successive Over-Relaxation   688.08    687.47    687.76    688.88    689.02
himeno: Poisson Pressure Solver               902.98    905.30    894.27    898.33    867.36
tscp: AI Chess Performance                    739101    738707    738311    742690    760113
build-apache: Time To Compile                 58.83     58.91     59.07     58.68     58.55
build-php: Time To Compile                    58.48     58.54     58.45     56.60     56.54
c-ray: Total Time                             40.54     40.55     40.67     53.33     87.90
encode-flac: WAV To FLAC                      5.52      5.47      5.29      6.90      6.62

x264 2014-01-09, H.264 Video Encoding
Frames Per Second, More Is Better
bdver3 (-march=bdver3): 83.83 (SE +/- 0.25, N = 5)
bdver2 (-march=bdver2): 83.85 (SE +/- 0.64, N = 5)
bdver1 (-march=bdver1): 84.14 (SE +/- 0.73, N = 5)
barcelona (-march=barcelona): 83.66 (SE +/- 0.46, N = 5)
k8 (-march=k8): 83.43 (SE +/- 0.71, N = 5)
(CC) gcc options: -ldl -m64 -lm -lpthread -O3 -ffast-math -std=gnu99 -fomit-frame-pointer -fno-tree-vectorize

GraphicsMagick 1.3.19, Operation: Blur
Iterations Per Minute, More Is Better
bdver3 (-march=bdver3): 106 (SE +/- 0.00, N = 3)
bdver2 (-march=bdver2): 110 (SE +/- 0.33, N = 3)
bdver1 (-march=bdver1): 108 (SE +/- 0.33, N = 3)
barcelona (-march=barcelona): 93 (SE +/- 0.00, N = 3)
k8 (-march=k8): 97 (SE +/- 0.00, N = 3)
(CC) gcc options: -std=gnu99 -fopenmp -O3 -pthread -ljbig -lwebp -ljpeg -lXext -lSM -lICE -lX11 -llzma -lxml2 -lz -lm -lpthread

GraphicsMagick 1.3.19, Operation: Sharpen
Iterations Per Minute, More Is Better
bdver3 (-march=bdver3): 81 (SE +/- 0.00, N = 3)
bdver2 (-march=bdver2): 81 (SE +/- 0.33, N = 3)
bdver1 (-march=bdver1): 87 (SE +/- 0.00, N = 3)
barcelona (-march=barcelona): 72 (SE +/- 0.00, N = 3)
k8 (-march=k8): 71 (SE +/- 0.00, N = 3)
(CC) gcc options: -std=gnu99 -fopenmp -O3 -pthread -ljbig -lwebp -ljpeg -lXext -lSM -lICE -lX11 -llzma -lxml2 -lz -lm -lpthread

GraphicsMagick 1.3.19, Operation: Resizing
Iterations Per Minute, More Is Better
bdver3 (-march=bdver3): 133 (SE +/- 0.33, N = 3)
bdver2 (-march=bdver2): 133 (SE +/- 0.00, N = 3)
bdver1 (-march=bdver1): 133 (SE +/- 0.33, N = 3)
barcelona (-march=barcelona): 120 (SE +/- 0.00, N = 3)
k8 (-march=k8): 106 (SE +/- 0.00, N = 3)
(CC) gcc options: -std=gnu99 -fopenmp -O3 -pthread -ljbig -lwebp -ljpeg -lXext -lSM -lICE -lX11 -llzma -lxml2 -lz -lm -lpthread

GraphicsMagick 1.3.19, Operation: HWB Color Space
Iterations Per Minute, More Is Better
bdver3 (-march=bdver3): 139 (SE +/- 0.00, N = 3)
bdver2 (-march=bdver2): 139 (SE +/- 0.33, N = 3)
bdver1 (-march=bdver1): 139 (SE +/- 0.00, N = 3)
barcelona (-march=barcelona): 138 (SE +/- 0.33, N = 3)
k8 (-march=k8): 126 (SE +/- 0.33, N = 3)
(CC) gcc options: -std=gnu99 -fopenmp -O3 -pthread -ljbig -lwebp -ljpeg -lXext -lSM -lICE -lX11 -llzma -lxml2 -lz -lm -lpthread

GraphicsMagick 1.3.19, Operation: Local Adaptive Thresholding
Iterations Per Minute, More Is Better
bdver3 (-march=bdver3): 80 (SE +/- 0.00, N = 3)
bdver2 (-march=bdver2): 81 (SE +/- 0.00, N = 3)
bdver1 (-march=bdver1): 81 (SE +/- 0.00, N = 3)
barcelona (-march=barcelona): 76 (SE +/- 0.33, N = 3)
k8 (-march=k8): 77 (SE +/- 0.00, N = 3)
(CC) gcc options: -std=gnu99 -fopenmp -O3 -pthread -ljbig -lwebp -ljpeg -lXext -lSM -lICE -lX11 -llzma -lxml2 -lz -lm -lpthread

SciMark 2.0, Computational Test: Composite
Mflops, More Is Better
bdver3 (-march=bdver3): 641.05 (SE +/- 1.36, N = 4)
bdver2 (-march=bdver2): 644.89 (SE +/- 1.36, N = 4)
bdver1 (-march=bdver1): 640.56 (SE +/- 1.89, N = 4)
barcelona (-march=barcelona): 629.44 (SE +/- 4.49, N = 4)
k8 (-march=k8): 636.54 (SE +/- 1.68, N = 4)
(CXX) g++ options: -O3

SciMark 2.0, Computational Test: Monte Carlo
Mflops, More Is Better
bdver3 (-march=bdver3): 413.64 (SE +/- 10.39, N = 4)
bdver2 (-march=bdver2): 423.28 (SE +/- 1.61, N = 4)
bdver1 (-march=bdver1): 423.73 (SE +/- 0.77, N = 4)
barcelona (-march=barcelona): 384.47 (SE +/- 0.60, N = 4)
k8 (-march=k8): 397.86 (SE +/- 7.32, N = 4)
(CXX) g++ options: -O3

SciMark 2.0, Computational Test: Fast Fourier Transform
Mflops, More Is Better
bdver3 (-march=bdver3): 70.77 (SE +/- 0.16, N = 4)
bdver2 (-march=bdver2): 71.15 (SE +/- 0.10, N = 4)
bdver1 (-march=bdver1): 68.50 (SE +/- 1.36, N = 4)
barcelona (-march=barcelona): 68.99 (SE +/- 0.94, N = 4)
k8 (-march=k8): 73.59 (SE +/- 0.22, N = 4)
(CXX) g++ options: -O3

SciMark 2.0, Computational Test: Sparse Matrix Multiply
Mflops, More Is Better
bdver3 (-march=bdver3): 866.81 (SE +/- 7.27, N = 4)
bdver2 (-march=bdver2): 877.91 (SE +/- 2.24, N = 4)
bdver1 (-march=bdver1): 860.41 (SE +/- 9.94, N = 4)
barcelona (-march=barcelona): 849.34 (SE +/- 15.60, N = 4)
k8 (-march=k8): 865.92 (SE +/- 4.11, N = 4)
(CXX) g++ options: -O3

SciMark 2.0, Computational Test: Dense LU Matrix Factorization
Mflops, More Is Better
bdver3 (-march=bdver3): 1165.94 (SE +/- 2.87, N = 4)
bdver2 (-march=bdver2): 1164.64 (SE +/- 3.97, N = 4)
bdver1 (-march=bdver1): 1162.41 (SE +/- 1.67, N = 4)
barcelona (-march=barcelona): 1155.51 (SE +/- 5.67, N = 4)
k8 (-march=k8): 1156.29 (SE +/- 6.32, N = 4)
(CXX) g++ options: -O3

SciMark 2.0, Computational Test: Jacobi Successive Over-Relaxation
Mflops, More Is Better
bdver3 (-march=bdver3): 688.08 (SE +/- 0.38, N = 4)
bdver2 (-march=bdver2): 687.47 (SE +/- 0.80, N = 4)
bdver1 (-march=bdver1): 687.76 (SE +/- 0.38, N = 4)
barcelona (-march=barcelona): 688.88 (SE +/- 0.08, N = 4)
k8 (-march=k8): 689.02 (SE +/- 0.12, N = 4)
(CXX) g++ options: -O3

Himeno Benchmark 3.0, Poisson Pressure Solver
MFLOPS, More Is Better
bdver3 (-march=bdver3): 902.98 (SE +/- 3.10, N = 3)
bdver2 (-march=bdver2): 905.30 (SE +/- 1.51, N = 3)
bdver1 (-march=bdver1): 894.27 (SE +/- 9.68, N = 3)
barcelona (-march=barcelona): 898.33 (SE +/- 0.19, N = 3)
k8 (-march=k8): 867.36 (SE +/- 6.79, N = 3)
(CC) gcc options: -O3

TSCP 1.81, AI Chess Performance
Nodes Per Second, More Is Better
bdver3 (-march=bdver3): 739101 (SE +/- 198.20, N = 5)
bdver2 (-march=bdver2): 738707 (SE +/- 671.07, N = 5)
bdver1 (-march=bdver1): 738311 (SE +/- 699.90, N = 5)
barcelona (-march=barcelona): 742690 (SE +/- 600.77, N = 5)
k8 (-march=k8): 760113 (SE +/- 257.20, N = 5)
(CC) gcc options: -O3

Timed Apache Compilation 2.4.7, Time To Compile
Seconds, Fewer Is Better
bdver3: 58.83 (SE +/- 0.12, N = 3)
bdver2: 58.91 (SE +/- 0.12, N = 3)
bdver1: 59.07 (SE +/- 0.21, N = 3)
barcelona: 58.68 (SE +/- 0.15, N = 3)
k8: 58.55 (SE +/- 0.19, N = 3)

Timed PHP Compilation 5.2.9, Time To Compile
Seconds, Fewer Is Better
bdver3 (-march=bdver3): 58.48 (SE +/- 0.01, N = 3)
bdver2 (-march=bdver2): 58.54 (SE +/- 0.07, N = 3)
bdver1 (-march=bdver1): 58.45 (SE +/- 0.10, N = 3)
barcelona (-march=barcelona): 56.60 (SE +/- 0.04, N = 3)
k8 (-march=k8): 56.54 (SE +/- 0.06, N = 3)
(CC) gcc options: -O3 -pedantic -ldl -lz -lm

C-Ray 1.1, Total Time
Seconds, Fewer Is Better
bdver3 (-march=bdver3): 40.54 (SE +/- 0.07, N = 3)
bdver2 (-march=bdver2): 40.55 (SE +/- 0.03, N = 3)
bdver1 (-march=bdver1): 40.67 (SE +/- 0.03, N = 3)
barcelona (-march=barcelona): 53.33 (SE +/- 0.09, N = 3)
k8 (-march=k8): 87.90 (SE +/- 0.02, N = 3)
(CC) gcc options: -lm -lpthread -O3

FLAC Audio Encoding 1.3.0, WAV To FLAC
Seconds, Fewer Is Better
bdver3 (-march=bdver3): 5.52 (SE +/- 0.02, N = 5)
bdver2 (-march=bdver2): 5.47 (SE +/- 0.01, N = 5)
bdver1 (-march=bdver1): 5.29 (SE +/- 0.02, N = 5)
barcelona (-march=barcelona): 6.90 (SE +/- 0.01, N = 5)
k8 (-march=k8): 6.62 (SE +/- 0.02, N = 5)
(CXX) g++ options: -O3 -fvisibility=hidden -logg -lm

Phoronix Test Suite v10.8.5