LLVM 3.4 Clang Compiler AMD Kaveri Benchmarking

Benchmarks by Michael Larabel for a future article on Phoronix.com looking at compiler performance on the AMD Kaveri A10-7850K with GCC 4.8, GCC 4.9, and LLVM Clang 3.4.

HTML result view exported from: https://openbenchmarking.org/result/1401278-PL-CLANGKAVE26&grt.

Tested configurations: GCC 4.9.0 20140126 and LLVM Clang 3.4. The two configurations list the same system details except for the compiler.

Processor: AMD A10-7850K APU with Radeon R7 @ 3.70GHz (4 Cores)
Motherboard: Gigabyte F2A88XM-D3H
Chipset: AMD Device 1422
Memory: 7168MB
Disk: 120GB KINGSTON SV300S3
Graphics: AMD Kaveri 1024MB
Audio: ATI R6xx HDMI
Monitor: TSB-TV
Network: Realtek RTL8111/8168/8411
OS: Ubuntu 14.04
Kernel: 3.13.0-5-generic (x86_64)
Desktop: Unity 7.1.2
Display Driver: radeon 7.2.99
Compiler: GCC 4.9.0 20140126 (GCC 4.9.0 20140126 configuration); Clang 3.4 + LLVM 3.4 (LLVM Clang 3.4 configuration)
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details
- radeon.dpm=1

Compiler Details
- GCC 4.9.0 20140126: --disable-multilib --enable-checking=release --enable-languages=c,c++,fortran
- LLVM Clang 3.4: Optimized build; Built Jan 27 2014 (16:11:42); Default target: x86_64-unknown-linux-gnu; Host CPU: bdver3

Processor Details
- Scaling Governor: acpi-cpufreq ondemand

Results overview

Test                                            GCC 4.9.0 20140126    LLVM Clang 3.4
apache: Static Web Page Serving                 17507.22              17173.10
blake2: Phoronix Test Suite v5.0.0m0            6.80                  6.78
bullet: Raytests                                4.52                  5.15
bullet: 3000 Fall                               8.56                  9.26
bullet: 1000 Stack                              9.43                  11.02
bullet: 1000 Convex                             7.20                  8.98
bullet: Prim Trimesh                            1.54                  1.65
bullet: Convex Trimesh                          1.76                  2.00
c-ray: Total Time                               40.50                 67.08
crafty: Elapsed Time                            103.35                105.01
ffmpeg: H.264 HD To NTSC DV                     21.37                 21.41
fhourstones: Complex Connect-4 Solving          9569.50               9557.57
hint: FLOAT                                     239745129.57          145204373.31
himeno: Poisson Pressure Solver                 903.78                886.78
lammps: Rhodopsin Protein                       59.71                 55.30
polybench-c: 3 Matrix Multiplications           126.07                122.80
scimark2: Composite                             641.01                738.28
scimark2: Monte Carlo                           420.02                401.33
scimark2: Fast Fourier Transform                70.69                 67.72
scimark2: Sparse Matrix Multiply                865.54                885.69
scimark2: Dense LU Matrix Factorization         1164.81               1257.71
scimark2: Jacobi Successive Over-Relaxation     683.99                1078.97
build-apache: Time To Compile                   58.93                 37.98
hmmer: Pfam Database Search                     19.34                 19.87
build-php: Time To Compile                      58.39                 33.22
tscp: AI Chess Performance                      737919                605783
x264: H.264 Video Encoding                      82.37                 81.91
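Because the overview mixes "more is better" and "fewer is better" metrics, the raw numbers are not directly comparable across tests. The following is a minimal Python sketch, not part of the original export, that normalizes a handful of the results above into a single ratio where values above 1.0 mean Clang is ahead. The function name `clang_speedup` and the `RESULTS` list are hypothetical names introduced for illustration; the numbers and the better/worse direction come from the per-test sections of this result file.

```python
# (test, GCC 4.9.0 result, LLVM Clang 3.4 result, higher_is_better)
# Values copied from the results in this file; the selection is illustrative.
RESULTS = [
    ("apache: Static Web Page Serving", 17507.22, 17173.10, True),
    ("c-ray: Total Time", 40.50, 67.08, False),
    ("hint: FLOAT", 239745129.57, 145204373.31, True),
    ("scimark2: Jacobi Successive Over-Relaxation", 683.99, 1078.97, True),
    ("build-php: Time To Compile", 58.39, 33.22, False),
]


def clang_speedup(gcc, clang, higher_is_better):
    """Return Clang's performance relative to GCC (> 1.0 means Clang is ahead).

    For "more is better" metrics the ratio is clang / gcc; for
    "fewer is better" metrics (times) it is inverted to gcc / clang.
    """
    return clang / gcc if higher_is_better else gcc / clang


for name, gcc, clang, higher_is_better in RESULTS:
    ratio = clang_speedup(gcc, clang, higher_is_better)
    print(f"{name}: Clang at {ratio:.2f}x of GCC")
```

Read this way, C-Ray shows Clang at roughly 0.6x of GCC, while the PHP build time shows Clang at roughly 1.76x.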

Apache Benchmark

Static Web Page Serving

Requests Per Second, More Is Better (Apache Benchmark 2.4.7)
GCC 4.9.0 20140126: 17507.22 (SE +/- 172.83, N = 3)
LLVM Clang 3.4: 17173.10 (SE +/- 207.92, N = 3)
(CC) gcc options: -shared -fPIC -pthread -O3 -march=native
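Each result in this file is reported with a standard error (SE) over N runs, which gives a rough sense of whether a gap between the two compilers exceeds run-to-run noise. The sketch below is an illustration only and not something the export itself computes: it expresses the gap between two means in units of their combined standard error. It assumes the SE values are listed in the same order as the results, and with N = 3 the outcome should be read as a rough indication rather than a formal significance test. The helper name `difference_in_se_units` is hypothetical.

```python
import math


def difference_in_se_units(mean_a, se_a, mean_b, se_b):
    """Gap between two means, divided by the combined standard error."""
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


# Apache Static Web Page Serving (Requests Per Second)
apache = difference_in_se_units(17507.22, 172.83, 17173.10, 207.92)

# C-Ray Total Time (Seconds), included for contrast
c_ray = difference_in_se_units(40.50, 0.02, 67.08, 0.03)

print(f"Apache: {apache:.1f} combined standard errors apart")
print(f"C-Ray:  {c_ray:.0f} combined standard errors apart")
```

Under these assumptions the Apache results sit only a little more than one combined standard error apart, so that gap may well be noise, whereas the C-Ray gap is hundreds of standard errors wide.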

BLAKE2

Phoronix Test Suite v5.0.0m0

Cycles Per Byte, Fewer Is Better (BLAKE2 20130131)
GCC 4.9.0 20140126: 6.80 (SE +/- 0.00, N = 3)
LLVM Clang 3.4: 6.78 (SE +/- 0.00, N = 3)
(CC) gcc options: -std=gnu99 -O3 -march=native -lcrypto -lz

Bullet Physics Engine

Test: Raytests

Seconds, Fewer Is Better (Bullet Physics Engine 2.81)
GCC 4.9.0 20140126: 4.52 (SE +/- 0.01, N = 3)
LLVM Clang 3.4: 5.15 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU

Bullet Physics Engine

Test: 3000 Fall

Seconds, Fewer Is Better (Bullet Physics Engine 2.81)
GCC 4.9.0 20140126: 8.56 (SE +/- 0.01, N = 3)
LLVM Clang 3.4: 9.26 (SE +/- 0.05, N = 3)
(CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU

Bullet Physics Engine

Test: 1000 Stack

Seconds, Fewer Is Better (Bullet Physics Engine 2.81)
GCC 4.9.0 20140126: 9.43 (SE +/- 0.04, N = 3)
LLVM Clang 3.4: 11.02 (SE +/- 0.06, N = 3)
(CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU

Bullet Physics Engine

Test: 1000 Convex

Seconds, Fewer Is Better (Bullet Physics Engine 2.81)
GCC 4.9.0 20140126: 7.20 (SE +/- 0.02, N = 3)
LLVM Clang 3.4: 8.98 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU

Bullet Physics Engine

Test: Prim Trimesh

Seconds, Fewer Is Better (Bullet Physics Engine 2.81)
GCC 4.9.0 20140126: 1.54 (SE +/- 0.01, N = 3)
LLVM Clang 3.4: 1.65 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU

Bullet Physics Engine

Test: Convex Trimesh

Seconds, Fewer Is Better (Bullet Physics Engine 2.81)
GCC 4.9.0 20140126: 1.76 (SE +/- 0.01, N = 3)
LLVM Clang 3.4: 2.00 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU

C-Ray

Total Time

Seconds, Fewer Is Better (C-Ray 1.1)
GCC 4.9.0 20140126: 40.50 (SE +/- 0.02, N = 3)
LLVM Clang 3.4: 67.08 (SE +/- 0.03, N = 3)
(CC) gcc options: -lm -lpthread -O3 -march=native

Crafty

Elapsed Time

Seconds, Fewer Is Better (Crafty 23.4)
GCC 4.9.0 20140126: 103.35 (SE +/- 0.26, N = 3)
LLVM Clang 3.4: 105.01 (SE +/- 0.18, N = 3)
(CC) gcc options: -lstdc++ -lm

FFmpeg

H.264 HD To NTSC DV

Seconds, Fewer Is Better (FFmpeg 2.1.1)
GCC 4.9.0 20140126: 21.37 (SE +/- 0.14, N = 3); additional options: -fno-tree-vectorize -MF -MT
LLVM Clang 3.4: 21.41 (SE +/- 0.03, N = 3); additional options: -Qunused-arguments
(CC) gcc options: -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -ldl -lasound -lSDL -lm -pthread -O3 -march=native -std=c99 -fomit-frame-pointer -fno-math-errno -fno-signed-zeros -MMD

Fhourstones

Complex Connect-4 Solving

Kpos / sec, More Is Better (Fhourstones 3.1)
GCC 4.9.0 20140126: 9569.50 (SE +/- 15.15, N = 3)
LLVM Clang 3.4: 9557.57 (SE +/- 3.26, N = 3)
(CC) gcc options: -O3

Hierarchical INTegration

Test: FLOAT

QUIPs, More Is Better (Hierarchical INTegration 1.0)
GCC 4.9.0 20140126: 239745129.57 (SE +/- 1044977.11, N = 3)
LLVM Clang 3.4: 145204373.31 (SE +/- 276317.42, N = 3)
(CC) gcc options: -O3 -march=native -lm

Himeno Benchmark

Poisson Pressure Solver

MFLOPS, More Is Better (Himeno Benchmark 3.0)
GCC 4.9.0 20140126: 903.78 (SE +/- 2.10, N = 3)
LLVM Clang 3.4: 886.78 (SE +/- 0.97, N = 3)
(CC) gcc options: -O3 -march=native

LAMMPS Molecular Dynamics Simulator

Test: Rhodopsin Protein

Loop Time, Fewer Is Better (LAMMPS Molecular Dynamics Simulator 1.0)
GCC 4.9.0 20140126: 59.71 (SE +/- 0.17, N = 3)
LLVM Clang 3.4: 55.30 (SE +/- 0.14, N = 3)
(CXX) g++ options: -lfftw -lmpich

PolyBench-C

Test: 3 Matrix Multiplications

Seconds, Fewer Is Better (PolyBench-C 3.2)
GCC 4.9.0 20140126: 126.07 (SE +/- 0.07, N = 3)
LLVM Clang 3.4: 122.80 (SE +/- 0.18, N = 3)
(CC) gcc options: -O3 -march=native

SciMark

Computational Test: Composite

Mflops, More Is Better (SciMark 2.0)
GCC 4.9.0 20140126: 641.01 (SE +/- 0.78, N = 4)
LLVM Clang 3.4: 738.28 (SE +/- 1.36, N = 4)
(CXX) g++ options: -O3 -march=native

SciMark

Computational Test: Monte Carlo

Mflops, More Is Better (SciMark 2.0)
GCC 4.9.0 20140126: 420.02 (SE +/- 4.58, N = 4)
LLVM Clang 3.4: 401.33 (SE +/- 2.09, N = 4)
(CXX) g++ options: -O3 -march=native

SciMark

Computational Test: Fast Fourier Transform

Mflops, More Is Better (SciMark 2.0)
GCC 4.9.0 20140126: 70.69 (SE +/- 0.69, N = 4)
LLVM Clang 3.4: 67.72 (SE +/- 0.10, N = 4)
(CXX) g++ options: -O3 -march=native

SciMark

Computational Test: Sparse Matrix Multiply

Mflops, More Is Better (SciMark 2.0)
GCC 4.9.0 20140126: 865.54 (SE +/- 6.95, N = 4)
LLVM Clang 3.4: 885.69 (SE +/- 3.68, N = 4)
(CXX) g++ options: -O3 -march=native

SciMark

Computational Test: Dense LU Matrix Factorization

Mflops, More Is Better (SciMark 2.0)
GCC 4.9.0 20140126: 1164.81 (SE +/- 1.14, N = 4)
LLVM Clang 3.4: 1257.71 (SE +/- 2.93, N = 4)
(CXX) g++ options: -O3 -march=native

SciMark

Computational Test: Jacobi Successive Over-Relaxation

Mflops, More Is Better (SciMark 2.0)
GCC 4.9.0 20140126: 683.99 (SE +/- 0.06, N = 4)
LLVM Clang 3.4: 1078.97 (SE +/- 0.11, N = 4)
(CXX) g++ options: -O3 -march=native
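The SciMark 2.0 Composite score is the arithmetic mean of the five kernel scores reported in the sections above. The following short Python sketch, added here as an illustration rather than taken from the export, recomputes the composite from the per-kernel results in this file and shows how much each kernel contributes to the gap between the two compilers. The dictionary names and the `composite` helper are hypothetical.

```python
# Per-kernel SciMark 2.0 results (Mflops) copied from this result file.
GCC = {
    "Monte Carlo": 420.02,
    "Fast Fourier Transform": 70.69,
    "Sparse Matrix Multiply": 865.54,
    "Dense LU Matrix Factorization": 1164.81,
    "Jacobi Successive Over-Relaxation": 683.99,
}
CLANG = {
    "Monte Carlo": 401.33,
    "Fast Fourier Transform": 67.72,
    "Sparse Matrix Multiply": 885.69,
    "Dense LU Matrix Factorization": 1257.71,
    "Jacobi Successive Over-Relaxation": 1078.97,
}


def composite(kernels):
    """Arithmetic mean of the kernel scores."""
    return sum(kernels.values()) / len(kernels)


print(f"GCC composite:   {composite(GCC):.2f}")
print(f"Clang composite: {composite(CLANG):.2f}")

# Each kernel's contribution to the composite gap (Clang minus GCC).
for name in GCC:
    delta = (CLANG[name] - GCC[name]) / len(GCC)
    print(f"{name}: {delta:+.2f} Mflops of the composite difference")
```

The recomputed means agree with the reported composites of 641.01 and 738.28, and most of Clang's composite lead comes from the Jacobi Successive Over-Relaxation kernel.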

Timed Apache Compilation

Time To Compile

Seconds, Fewer Is Better (Timed Apache Compilation 2.4.7)
GCC 4.9.0 20140126: 58.93 (SE +/- 0.14, N = 3)
LLVM Clang 3.4: 37.98 (SE +/- 0.06, N = 3)

Timed HMMer Search

Pfam Database Search

Seconds, Fewer Is Better (Timed HMMer Search 2.3.2)
GCC 4.9.0 20140126: 19.34 (SE +/- 0.21, N = 3)
LLVM Clang 3.4: 19.87 (SE +/- 0.33, N = 3)
(CC) gcc options: -O3 -march=native -pthread -lhmmer -lsquid -lm

Timed PHP Compilation

Time To Compile

Seconds, Fewer Is Better (Timed PHP Compilation 5.2.9)
GCC 4.9.0 20140126: 58.39 (SE +/- 0.12, N = 3)
LLVM Clang 3.4: 33.22 (SE +/- 0.05, N = 3)
(CC) gcc options: -O3 -march=native -pedantic -ldl -lz -lm

TSCP

AI Chess Performance

Nodes Per Second, More Is Better (TSCP 1.81)
GCC 4.9.0 20140126: 737919 (SE +/- 1107.98, N = 5)
LLVM Clang 3.4: 605783 (SE +/- 326.52, N = 5)
(CC) gcc options: -O3 -march=native

x264

H.264 Video Encoding

Frames Per Second, More Is Better (x264 2014-01-09)
GCC 4.9.0 20140126: 82.37 (SE +/- 0.71, N = 5)
LLVM Clang 3.4: 81.91 (SE +/- 0.51, N = 5)
(CC) gcc options: -ldl -m64 -lm -lpthread -O3 -ffast-math -march=native -std=gnu99 -fomit-frame-pointer -fno-tree-vectorize


Phoronix Test Suite v10.8.4