GCC 4.9 Compiler Optimization Tuning AMD Kaveri

Compiler optimization tuning for the AMD Steamroller CPU cores of the AMD A10-7850K Kaveri APU, comparing various -march= values. Benchmarks by Michael Larabel for a future article on Phoronix.com.

HTML result view exported from: https://openbenchmarking.org/result/1401282-PL-GCC49COMP74&sro&grr.

System Details (identical for the k8, barcelona, bdver1, bdver2, and bdver3 runs)

Processor: AMD A10-7850K APU with Radeon R7 @ 3.70GHz (4 Cores)
Motherboard: Gigabyte F2A88XM-D3H
Chipset: AMD Device 1422
Memory: 7168MB
Disk: 120GB KINGSTON SV300S3
Graphics: AMD Kaveri 1024MB
Audio: ATI R6xx HDMI
Monitor: TSB-TV
Network: Realtek RTL8111/8168/8411
OS: Ubuntu 14.04
Kernel: 3.13.0-5-generic (x86_64)
Desktop: Unity 7.1.2
Display Driver: radeon 7.2.99
Compiler: GCC 4.9.0 20140126 + Clang 3.4 + LLVM 3.4
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details: radeon.dpm=1
Compiler Details: --disable-multilib --enable-checking=release --enable-languages=c,c++,fortran
Processor Details: Scaling Governor: acpi-cpufreq ondemand
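The five runs differ only in the -march= value given to GCC. A minimal sketch of how such a sweep could be set up follows. It assumes the flag is supplied through the CFLAGS and CXXFLAGS environment variables together with -O3, which this result file does not confirm. The helper names flags_for and build_env are hypothetical.

```python
# The -march= values compared in this result file.
MARCH_VALUES = ["k8", "barcelona", "bdver1", "bdver2", "bdver3"]


def flags_for(march, opt_level="-O3"):
    """Return the compiler flag string for one -march= value."""
    return f"{opt_level} -march={march}"


def build_env(march):
    """Return environment variables for one run (assumed mechanism)."""
    flags = flags_for(march)
    return {"CFLAGS": flags, "CXXFLAGS": flags}


# Print the environment each run would use.
for march in MARCH_VALUES:
    print(march, build_env(march))
```

Each dictionary could be merged into os.environ before building a test, so that every benchmark is compiled once per target.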

Results Overview

Test                                          k8         barcelona  bdver1     bdver2     bdver3
encode-flac: WAV To FLAC                      6.62       6.90       5.29       5.47       5.52
c-ray: Total Time                             87.90      53.33      40.67      40.55      40.54
build-php: Time To Compile                    56.54      56.60      58.45      58.54      58.48
build-apache: Time To Compile                 58.55      58.68      59.07      58.91      58.83
himeno: Poisson Pressure Solver               867.36     898.33     894.27     905.30     902.98
graphics-magick: Local Adaptive Thresholding  77         76         81         81         80
graphics-magick: HWB Color Space              126        138        139        139        139
graphics-magick: Resizing                     106        120        133        133        133
graphics-magick: Sharpen                      71         72         87         81         81
graphics-magick: Blur                         97         93         108        110        106
x264: H.264 Video Encoding                    83.43      83.66      84.14      83.85      83.83
tscp: AI Chess Performance                    760113     742690     738311     738707     739101
scimark2: Jacobi Successive Over-Relaxation   689.02     688.88     687.76     687.47     688.08
scimark2: Dense LU Matrix Factorization       1156.29    1155.51    1162.41    1164.64    1165.94
scimark2: Sparse Matrix Multiply              865.92     849.34     860.41     877.91     866.81
scimark2: Fast Fourier Transform              73.59      68.99      68.50      71.15      70.77
scimark2: Monte Carlo                         397.86     384.47     423.73     423.28     413.64
scimark2: Composite                           636.54     629.44     640.56     644.89     641.05
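The overview mixes tests where lower is better (seconds) with tests where higher is better (throughput), so raw figures are not directly comparable. A small sketch of one way to normalise them against the k8 run is shown here. The selection of tests and the function name speedup are illustrative choices, and the numbers are copied from the k8 and bdver3 results.

```python
# (k8 result, bdver3 result, higher_is_better), copied from the overview.
RESULTS = {
    "encode-flac (seconds)": (6.62, 5.52, False),
    "c-ray (seconds)": (87.90, 40.54, False),
    "graphics-magick resizing (iterations/min)": (106, 133, True),
    "tscp (nodes/s)": (760113, 739101, True),
}


def speedup(baseline, tuned, higher_is_better):
    """Return a ratio above 1.0 when the tuned build beats the baseline."""
    if higher_is_better:
        return tuned / baseline
    return baseline / tuned


for name, (k8, bdver3, higher_is_better) in RESULTS.items():
    ratio = speedup(k8, bdver3, higher_is_better)
    print(f"{name}: bdver3 is {ratio:.2f}x relative to k8")
```

A ratio below 1.0, as for tscp, means the k8 build was the faster of the two.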

FLAC Audio Encoding

WAV To FLAC

FLAC Audio Encoding 1.3.0, WAV To FLAC (Seconds, Fewer Is Better)

-march=     Result    SE +/-   N
barcelona   6.90      0.01     5
bdver1      5.29      0.02     5
bdver2      5.47      0.01     5
bdver3      5.52      0.02     5
k8          6.62      0.02     5

(CXX) g++ options: -O3 -fvisibility=hidden -logg -lm

C-Ray

Total Time

C-Ray 1.1, Total Time (Seconds, Fewer Is Better)

-march=     Result    SE +/-   N
barcelona   53.33     0.09     3
bdver1      40.67     0.03     3
bdver2      40.55     0.03     3
bdver3      40.54     0.07     3
k8          87.90     0.02     3

(CC) gcc options: -lm -lpthread -O3

Timed PHP Compilation

Time To Compile

Timed PHP Compilation 5.2.9, Time To Compile (Seconds, Fewer Is Better)

-march=     Result    SE +/-   N
barcelona   56.60     0.04     3
bdver1      58.45     0.10     3
bdver2      58.54     0.07     3
bdver3      58.48     0.01     3
k8          56.54     0.06     3

(CC) gcc options: -O3 -pedantic -ldl -lz -lm

Timed Apache Compilation

Time To Compile

Timed Apache Compilation 2.4.7, Time To Compile (Seconds, Fewer Is Better)

-march=     Result    SE +/-   N
barcelona   58.68     0.15     3
bdver1      59.07     0.21     3
bdver2      58.91     0.12     3
bdver3      58.83     0.12     3
k8          58.55     0.19     3

Himeno Benchmark

Poisson Pressure Solver

Himeno Benchmark 3.0, Poisson Pressure Solver (MFLOPS, More Is Better)

-march=     Result    SE +/-   N
barcelona   898.33    0.19     3
bdver1      894.27    9.68     3
bdver2      905.30    1.51     3
bdver3      902.98    3.10     3
k8          867.36    6.79     3

(CC) gcc options: -O3

GraphicsMagick

Operation: Local Adaptive Thresholding

GraphicsMagick 1.3.19, Operation: Local Adaptive Thresholding (Iterations Per Minute, More Is Better)

-march=     Result    SE +/-   N
barcelona   76        0.33     3
bdver1      81        0.00     3
bdver2      81        0.00     3
bdver3      80        0.00     3
k8          77        0.00     3

(CC) gcc options: -std=gnu99 -fopenmp -O3 -pthread -ljbig -lwebp -ljpeg -lXext -lSM -lICE -lX11 -llzma -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: HWB Color Space

GraphicsMagick 1.3.19, Operation: HWB Color Space (Iterations Per Minute, More Is Better)

-march=     Result    SE +/-   N
barcelona   138       0.33     3
bdver1      139       0.00     3
bdver2      139       0.33     3
bdver3      139       0.00     3
k8          126       0.33     3

(CC) gcc options: -std=gnu99 -fopenmp -O3 -pthread -ljbig -lwebp -ljpeg -lXext -lSM -lICE -lX11 -llzma -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.19, Operation: Resizing (Iterations Per Minute, More Is Better)

-march=     Result    SE +/-   N
barcelona   120       0.00     3
bdver1      133       0.33     3
bdver2      133       0.00     3
bdver3      133       0.33     3
k8          106       0.00     3

(CC) gcc options: -std=gnu99 -fopenmp -O3 -pthread -ljbig -lwebp -ljpeg -lXext -lSM -lICE -lX11 -llzma -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.19, Operation: Sharpen (Iterations Per Minute, More Is Better)

-march=     Result    SE +/-   N
barcelona   72        0.00     3
bdver1      87        0.00     3
bdver2      81        0.33     3
bdver3      81        0.00     3
k8          71        0.00     3

(CC) gcc options: -std=gnu99 -fopenmp -O3 -pthread -ljbig -lwebp -ljpeg -lXext -lSM -lICE -lX11 -llzma -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Blur

GraphicsMagick 1.3.19, Operation: Blur (Iterations Per Minute, More Is Better)

-march=     Result    SE +/-   N
barcelona   93        0.00     3
bdver1      108       0.33     3
bdver2      110       0.33     3
bdver3      106       0.00     3
k8          97        0.00     3

(CC) gcc options: -std=gnu99 -fopenmp -O3 -pthread -ljbig -lwebp -ljpeg -lXext -lSM -lICE -lX11 -llzma -lxml2 -lz -lm -lpthread

x264

H.264 Video Encoding

x264 2014-01-09, H.264 Video Encoding (Frames Per Second, More Is Better)

-march=     Result    SE +/-   N
barcelona   83.66     0.46     5
bdver1      84.14     0.73     5
bdver2      83.85     0.64     5
bdver3      83.83     0.25     5
k8          83.43     0.71     5

(CC) gcc options: -ldl -m64 -lm -lpthread -O3 -ffast-math -std=gnu99 -fomit-frame-pointer -fno-tree-vectorize
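The standard errors help judge whether two results really differ. One rough sketch is to divide the gap between two means by their combined standard error. The threshold of 2 used here is a common rule of thumb and an assumption of this example, not a formal test, especially with only 3 to 5 runs per result. The function name separation is hypothetical.

```python
import math


def separation(mean_a, se_a, mean_b, se_b):
    """Return the gap between two means in units of combined standard error."""
    gap = abs(mean_a - mean_b)
    combined = math.sqrt(se_a ** 2 + se_b ** 2)
    if combined == 0:
        # Both errors were reported as 0.00; any gap is then unbounded.
        return math.inf if gap > 0 else 0.0
    return gap / combined


# x264: bdver1 (84.14 +/- 0.73) against k8 (83.43 +/- 0.71).
x264 = separation(84.14, 0.73, 83.43, 0.71)
# C-Ray: k8 (87.90 +/- 0.02) against bdver3 (40.54 +/- 0.07).
c_ray = separation(87.90, 0.02, 40.54, 0.07)

for name, value in (("x264", x264), ("c-ray", c_ray)):
    verdict = "likely a real difference" if value > 2 else "within run-to-run noise"
    print(f"{name}: {value:.1f} combined standard errors, {verdict}")
```

By this measure the x264 spread between targets is indistinguishable from noise, while the C-Ray gap is far larger than its measurement error.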

TSCP

AI Chess Performance

TSCP 1.81, AI Chess Performance (Nodes Per Second, More Is Better)

-march=     Result    SE +/-   N
barcelona   742690    600.77   5
bdver1      738311    699.90   5
bdver2      738707    671.07   5
bdver3      739101    198.20   5
k8          760113    257.20   5

(CC) gcc options: -O3

SciMark

Computational Test: Jacobi Successive Over-Relaxation

SciMark 2.0, Computational Test: Jacobi Successive Over-Relaxation (Mflops, More Is Better)

-march=     Result    SE +/-   N
barcelona   688.88    0.08     4
bdver1      687.76    0.38     4
bdver2      687.47    0.80     4
bdver3      688.08    0.38     4
k8          689.02    0.12     4

(CXX) g++ options: -O3

SciMark

Computational Test: Dense LU Matrix Factorization

SciMark 2.0, Computational Test: Dense LU Matrix Factorization (Mflops, More Is Better)

-march=     Result    SE +/-   N
barcelona   1155.51   5.67     4
bdver1      1162.41   1.67     4
bdver2      1164.64   3.97     4
bdver3      1165.94   2.87     4
k8          1156.29   6.32     4

(CXX) g++ options: -O3

SciMark

Computational Test: Sparse Matrix Multiply

SciMark 2.0, Computational Test: Sparse Matrix Multiply (Mflops, More Is Better)

-march=     Result    SE +/-   N
barcelona   849.34    15.60    4
bdver1      860.41    9.94     4
bdver2      877.91    2.24     4
bdver3      866.81    7.27     4
k8          865.92    4.11     4

(CXX) g++ options: -O3

SciMark

Computational Test: Fast Fourier Transform

SciMark 2.0, Computational Test: Fast Fourier Transform (Mflops, More Is Better)

-march=     Result    SE +/-   N
barcelona   68.99     0.94     4
bdver1      68.50     1.36     4
bdver2      71.15     0.10     4
bdver3      70.77     0.16     4
k8          73.59     0.22     4

(CXX) g++ options: -O3

SciMark

Computational Test: Monte Carlo

SciMark 2.0, Computational Test: Monte Carlo (Mflops, More Is Better)

-march=     Result    SE +/-   N
barcelona   384.47    0.60     4
bdver1      423.73    0.77     4
bdver2      423.28    1.61     4
bdver3      413.64    10.39    4
k8          397.86    7.32     4

(CXX) g++ options: -O3

SciMark

Computational Test: Composite

SciMark 2.0, Computational Test: Composite (Mflops, More Is Better)

-march=     Result    SE +/-   N
barcelona   629.44    4.49     4
bdver1      640.56    1.89     4
bdver2      644.89    1.36     4
bdver3      641.05    1.36     4
k8          636.54    1.68     4

(CXX) g++ options: -O3
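SciMark 2.0 reports its composite score as the arithmetic mean of its five kernel scores. The short check below recomputes that mean from the per-kernel results in this file and compares it with the reported composite. The dictionary layout and the function name composite are choices made for this sketch.

```python
# Per-kernel Mflops in the order SOR, LU, Sparse, FFT, Monte Carlo.
KERNELS = {
    "k8": [689.02, 1156.29, 865.92, 73.59, 397.86],
    "barcelona": [688.88, 1155.51, 849.34, 68.99, 384.47],
    "bdver1": [687.76, 1162.41, 860.41, 68.50, 423.73],
    "bdver2": [687.47, 1164.64, 877.91, 71.15, 423.28],
    "bdver3": [688.08, 1165.94, 866.81, 70.77, 413.64],
}

# Composite scores as reported in this result file.
REPORTED = {
    "k8": 636.54,
    "barcelona": 629.44,
    "bdver1": 640.56,
    "bdver2": 644.89,
    "bdver3": 641.05,
}


def composite(scores):
    """Return the arithmetic mean of the kernel scores."""
    return sum(scores) / len(scores)


for march, scores in KERNELS.items():
    print(f"{march}: computed {composite(scores):.2f}, reported {REPORTED[march]:.2f}")
```

Because the composite is a plain average, the Dense LU kernel, with scores above 1100 Mflops, contributes far more to it than the FFT kernel at around 70 Mflops.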


Phoronix Test Suite v10.8.5