LLVM 3.4 Clang Compiler AMD Kaveri Benchmarking

Benchmarks by Michael Larabel for a future article on Phoronix.com looking at compiler performance on the AMD Kaveri A10-7850K with GCC 4.8, GCC 4.9, and LLVM Clang 3.4.
HTML result view exported from: https://openbenchmarking.org/result/1401278-PL-CLANGKAVE26&grw&sro .
Configurations compared: GCC 4.9.0 20140126, LLVM Clang 3.4

Processor: AMD A10-7850K APU with Radeon R7 @ 3.70GHz (4 Cores)
Motherboard: Gigabyte F2A88XM-D3H
Chipset: AMD Device 1422
Memory: 7168MB
Disk: 120GB KINGSTON SV300S3
Graphics: AMD Kaveri 1024MB
Audio: ATI R6xx HDMI
Monitor: TSB-TV
Network: Realtek RTL8111/8168/8411
OS: Ubuntu 14.04
Kernel: 3.13.0-5-generic (x86_64)
Desktop: Unity 7.1.2
Display Driver: radeon 7.2.99
Compiler: GCC 4.9.0 20140126 (GCC 4.9.0 20140126 run); Clang 3.4 + LLVM 3.4 (LLVM Clang 3.4 run)
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details
  radeon.dpm=1
Compiler Details
  GCC 4.9.0 20140126: --disable-multilib --enable-checking=release --enable-languages=c,c++,fortran
  LLVM Clang 3.4: Optimized build; Built Jan 27 2014 (16:11:42); Default target: x86_64-unknown-linux-gnu; Host CPU: bdver3
Processor Details
  Scaling Governor: acpi-cpufreq ondemand
Result overview

Test                                          GCC 4.9.0 20140126  LLVM Clang 3.4
bullet: Raytests                              4.52                5.15
bullet: 3000 Fall                             8.56                9.26
bullet: 1000 Stack                            9.43                11.02
bullet: 1000 Convex                           7.20                8.98
bullet: Prim Trimesh                          1.54                1.65
bullet: Convex Trimesh                        1.76                2.00
tscp: AI Chess Performance                    737919              605783
scimark2: Composite                           641.01              738.28
scimark2: Monte Carlo                         420.02              401.33
scimark2: Fast Fourier Transform              70.69               67.72
scimark2: Sparse Matrix Multiply              865.54              885.69
scimark2: Dense LU Matrix Factorization       1164.81             1257.71
scimark2: Jacobi Successive Over-Relaxation   683.99              1078.97
crafty: Elapsed Time                          103.35              105.01
hint: FLOAT                                   239745129.57        145204373.31
blake2: Phoronix Test Suite v5.0.0m0          6.80                6.78
hmmer: Pfam Database Search                   19.34               19.87
himeno: Poisson Pressure Solver               903.78              886.78
lammps: Rhodopsin Protein                     59.71               55.30
build-apache: Time To Compile                 58.93               37.98
build-php: Time To Compile                    58.39               33.22
x264: H.264 Video Encoding                    82.37               81.91
c-ray: Total Time                             40.50               67.08
ffmpeg: H.264 HD To NTSC DV                   21.37               21.41
apache: Static Web Page Serving               17507.22            17173.10
fhourstones: Complex Connect-4 Solving        9569.50             9557.57
polybench-c: 3 Matrix Multiplications         126.07              122.80
Bullet Physics Engine 2.81, Test: Raytests (Seconds, Fewer Is Better)
  GCC 4.9.0 20140126: 4.52 (SE +/- 0.01, N = 3)
  LLVM Clang 3.4: 5.15 (SE +/- 0.01, N = 3)
  (CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU
Bullet Physics Engine 2.81, Test: 3000 Fall (Seconds, Fewer Is Better)
  GCC 4.9.0 20140126: 8.56 (SE +/- 0.01, N = 3)
  LLVM Clang 3.4: 9.26 (SE +/- 0.05, N = 3)
  (CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU
Bullet Physics Engine 2.81, Test: 1000 Stack (Seconds, Fewer Is Better)
  GCC 4.9.0 20140126: 9.43 (SE +/- 0.04, N = 3)
  LLVM Clang 3.4: 11.02 (SE +/- 0.06, N = 3)
  (CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU
Bullet Physics Engine 2.81, Test: 1000 Convex (Seconds, Fewer Is Better)
  GCC 4.9.0 20140126: 7.20 (SE +/- 0.02, N = 3)
  LLVM Clang 3.4: 8.98 (SE +/- 0.00, N = 3)
  (CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU
Bullet Physics Engine 2.81, Test: Prim Trimesh (Seconds, Fewer Is Better)
  GCC 4.9.0 20140126: 1.54 (SE +/- 0.01, N = 3)
  LLVM Clang 3.4: 1.65 (SE +/- 0.00, N = 3)
  (CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU
Bullet Physics Engine 2.81, Test: Convex Trimesh (Seconds, Fewer Is Better)
  GCC 4.9.0 20140126: 1.76 (SE +/- 0.01, N = 3)
  LLVM Clang 3.4: 2.00 (SE +/- 0.00, N = 3)
  (CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU
TSCP 1.81, AI Chess Performance (Nodes Per Second, More Is Better)
  GCC 4.9.0 20140126: 737919 (SE +/- 1107.98, N = 5)
  LLVM Clang 3.4: 605783 (SE +/- 326.52, N = 5)
  (CC) gcc options: -O3 -march=native
SciMark 2.0, Computational Test: Composite (Mflops, More Is Better)
  GCC 4.9.0 20140126: 641.01 (SE +/- 0.78, N = 4)
  LLVM Clang 3.4: 738.28 (SE +/- 1.36, N = 4)
  (CXX) g++ options: -O3 -march=native
SciMark 2.0, Computational Test: Monte Carlo (Mflops, More Is Better)
  GCC 4.9.0 20140126: 420.02 (SE +/- 4.58, N = 4)
  LLVM Clang 3.4: 401.33 (SE +/- 2.09, N = 4)
  (CXX) g++ options: -O3 -march=native
SciMark 2.0, Computational Test: Fast Fourier Transform (Mflops, More Is Better)
  GCC 4.9.0 20140126: 70.69 (SE +/- 0.69, N = 4)
  LLVM Clang 3.4: 67.72 (SE +/- 0.10, N = 4)
  (CXX) g++ options: -O3 -march=native
SciMark 2.0, Computational Test: Sparse Matrix Multiply (Mflops, More Is Better)
  GCC 4.9.0 20140126: 865.54 (SE +/- 6.95, N = 4)
  LLVM Clang 3.4: 885.69 (SE +/- 3.68, N = 4)
  (CXX) g++ options: -O3 -march=native
SciMark 2.0, Computational Test: Dense LU Matrix Factorization (Mflops, More Is Better)
  GCC 4.9.0 20140126: 1164.81 (SE +/- 1.14, N = 4)
  LLVM Clang 3.4: 1257.71 (SE +/- 2.93, N = 4)
  (CXX) g++ options: -O3 -march=native
SciMark 2.0, Computational Test: Jacobi Successive Over-Relaxation (Mflops, More Is Better)
  GCC 4.9.0 20140126: 683.99 (SE +/- 0.06, N = 4)
  LLVM Clang 3.4: 1078.97 (SE +/- 0.11, N = 4)
  (CXX) g++ options: -O3 -march=native
Crafty 23.4, Elapsed Time (Seconds, Fewer Is Better)
  GCC 4.9.0 20140126: 103.35 (SE +/- 0.26, N = 3)
  LLVM Clang 3.4: 105.01 (SE +/- 0.18, N = 3)
  (CC) gcc options: -lstdc++ -lm
Hierarchical INTegration 1.0, Test: FLOAT (QUIPs, More Is Better)
  GCC 4.9.0 20140126: 239745129.57 (SE +/- 1044977.11, N = 3)
  LLVM Clang 3.4: 145204373.31 (SE +/- 276317.42, N = 3)
  (CC) gcc options: -O3 -march=native -lm
BLAKE2 20130131, Phoronix Test Suite v5.0.0m0 (Cycles Per Byte, Fewer Is Better)
  GCC 4.9.0 20140126: 6.80 (SE +/- 0.00, N = 3)
  LLVM Clang 3.4: 6.78 (SE +/- 0.00, N = 3)
  (CC) gcc options: -std=gnu99 -O3 -march=native -lcrypto -lz
Timed HMMer Search 2.3.2, Pfam Database Search (Seconds, Fewer Is Better)
  GCC 4.9.0 20140126: 19.34 (SE +/- 0.21, N = 3)
  LLVM Clang 3.4: 19.87 (SE +/- 0.33, N = 3)
  (CC) gcc options: -O3 -march=native -pthread -lhmmer -lsquid -lm
Himeno Benchmark 3.0, Poisson Pressure Solver (MFLOPS, More Is Better)
  GCC 4.9.0 20140126: 903.78 (SE +/- 2.10, N = 3)
  LLVM Clang 3.4: 886.78 (SE +/- 0.97, N = 3)
  (CC) gcc options: -O3 -march=native
LAMMPS Molecular Dynamics Simulator 1.0, Test: Rhodopsin Protein (Loop Time, Fewer Is Better)
  GCC 4.9.0 20140126: 59.71 (SE +/- 0.17, N = 3)
  LLVM Clang 3.4: 55.30 (SE +/- 0.14, N = 3)
  (CXX) g++ options: -lfftw -lmpich
Timed Apache Compilation 2.4.7, Time To Compile (Seconds, Fewer Is Better)
  GCC 4.9.0 20140126: 58.93 (SE +/- 0.14, N = 3)
  LLVM Clang 3.4: 37.98 (SE +/- 0.06, N = 3)
Timed PHP Compilation 5.2.9, Time To Compile (Seconds, Fewer Is Better)
  GCC 4.9.0 20140126: 58.39 (SE +/- 0.12, N = 3)
  LLVM Clang 3.4: 33.22 (SE +/- 0.05, N = 3)
  (CC) gcc options: -O3 -march=native -pedantic -ldl -lz -lm
x264 2014-01-09, H.264 Video Encoding (Frames Per Second, More Is Better)
  GCC 4.9.0 20140126: 82.37 (SE +/- 0.71, N = 5)
  LLVM Clang 3.4: 81.91 (SE +/- 0.51, N = 5)
  (CC) gcc options: -ldl -m64 -lm -lpthread -O3 -ffast-math -march=native -std=gnu99 -fomit-frame-pointer -fno-tree-vectorize
C-Ray 1.1, Total Time (Seconds, Fewer Is Better)
  GCC 4.9.0 20140126: 40.50 (SE +/- 0.02, N = 3)
  LLVM Clang 3.4: 67.08 (SE +/- 0.03, N = 3)
  (CC) gcc options: -lm -lpthread -O3 -march=native
FFmpeg 2.1.1, H.264 HD To NTSC DV (Seconds, Fewer Is Better)
  GCC 4.9.0 20140126: 21.37 (SE +/- 0.14, N = 3)
  LLVM Clang 3.4: 21.41 (SE +/- 0.03, N = 3)
  Additional flags: -fno-tree-vectorize -MF -MT -Qunused-arguments
  (CC) gcc options: -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -ldl -lasound -lSDL -lm -pthread -O3 -march=native -std=c99 -fomit-frame-pointer -fno-math-errno -fno-signed-zeros -MMD
Apache Benchmark 2.4.7, Static Web Page Serving (Requests Per Second, More Is Better)
  GCC 4.9.0 20140126: 17507.22 (SE +/- 172.83, N = 3)
  LLVM Clang 3.4: 17173.10 (SE +/- 207.92, N = 3)
  (CC) gcc options: -shared -fPIC -pthread -O3 -march=native
Fhourstones 3.1, Complex Connect-4 Solving (Kpos / sec, More Is Better)
  GCC 4.9.0 20140126: 9569.50 (SE +/- 15.15, N = 3)
  LLVM Clang 3.4: 9557.57 (SE +/- 3.26, N = 3)
  (CC) gcc options: -O3
PolyBench-C 3.2, Test: 3 Matrix Multiplications (Seconds, Fewer Is Better)
  GCC 4.9.0 20140126: 126.07 (SE +/- 0.07, N = 3)
  LLVM Clang 3.4: 122.80 (SE +/- 0.18, N = 3)
  (CC) gcc options: -O3 -march=native
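Because the tests above mix "Fewer Is Better" and "More Is Better" units, the raw numbers are awkward to compare across tests. The following is a minimal sketch, not part of the original export, of one way to turn a result pair into a signed percentage. The helper name `clang_vs_gcc_percent` and the selection of sample tests are assumptions made for illustration; the values themselves are copied from the results above.

```python
# Illustrative only: the helper name and the choice of tests are assumptions;
# the numeric values are copied from the results listed above.


def clang_vs_gcc_percent(gcc: float, clang: float, more_is_better: bool) -> float:
    """Relative performance of the Clang run versus the GCC run, in percent.

    Positive means the Clang result is better, negative means the GCC
    result is better, regardless of the direction of the unit.
    """
    # For "More Is Better" units a larger Clang value is a win;
    # for "Fewer Is Better" units a smaller Clang value is a win.
    ratio = clang / gcc if more_is_better else gcc / clang
    return (ratio - 1.0) * 100.0


# (test, GCC 4.9.0 20140126 result, LLVM Clang 3.4 result, more_is_better)
SAMPLES = [
    ("scimark2: Jacobi Successive Over-Relaxation (Mflops)", 683.99, 1078.97, True),
    ("hint: FLOAT (QUIPs)", 239745129.57, 145204373.31, True),
    ("c-ray: Total Time (Seconds)", 40.50, 67.08, False),
    ("build-php: Time To Compile (Seconds)", 58.39, 33.22, False),
]

if __name__ == "__main__":
    for name, gcc, clang, more_is_better in SAMPLES:
        delta = clang_vs_gcc_percent(gcc, clang, more_is_better)
        print(f"{name}: {delta:+.1f}% (Clang relative to GCC)")
```

Note that the two "Time To Compile" tests measure how fast each compiler builds a project, not the speed of the code it generates, so their percentages should be read separately from the runtime tests.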
Phoronix Test Suite v10.8.4