LLVM Clang 3.4 AMD APU Benchmarks

Benchmarks by Michael Larabel for a future article on Phoronix.com. A quick look at LLVM Clang 3.3 vs. Clang 3.4 compiler performance. More tests forthcoming.
HTML result view exported from: https://openbenchmarking.org/result/1411077-SO-1312033SO50&sor&export=pdf&grr .
System details per configuration (a field appears under a configuration only where the export lists a value for it):

Clang 3.3
  Processor: AMD A10-6800K APU @ 4.70GHz (4 Cores)
  Motherboard: MSI FM2-A85XA-G65 (MS-7793) v1.0
  Chipset: AMD Family 15h
  Memory: 8192MB
  Disk: 64GB OCZ AGILITY
  Graphics: Sapphire AMD Radeon HD 6950 2048MB
  Audio: Realtek ALC892
  Monitor: SyncMaster
  Network: Realtek RTL8111/8168/8411
  OS: Ubuntu 13.10
  Kernel: 3.13.0-999-generic (x86_64)
  Desktop: Unity 7.1.2
  Display Server: X Server 1.14.3
  Display Driver: radeon 7.2.99
  OpenGL: 3.1 Mesa 10.1.0-devel (git-5b331f6 saucy-oibaf-ppa) Gallium 0.4
  Compiler: Clang 3.3-5ubuntu4
  File-System: ext4
  Screen Resolution: 2560x1600

Clang 3.5 SVN
  Compiler: Clang 3.5-1~exp1

i7 4700HQ
  Processor: Intel Core i7-4700HQ @ 2.40GHz (8 Cores)
  Motherboard: ASUS G750JM v1.0
  Chipset: Intel Xeon E3-1200 v3/4th
  Memory: 31744MB
  Disk: 1000GB Seagate ST1000LM014-1EJ1 + 1000GB TOSHIBA MQ01ABD1 + 1500GB HGST HTS541515A9
  Graphics: ASUS NVIDIA GeForce GTX 860M 2048MB (540/2505MHz)
  Audio: Intel Haswell HDMI
  Network: Qualcomm Atheros QCA8171 Gigabit + Broadcom BCM4352 802.11ac Wireless
  OS: Ubuntu 14.04
  Kernel: 3.13.0-39-generic (x86_64)
  Desktop: Unity 7.2.3
  Display Server: X Server 1.15.1
  OpenGL: 4.3.0
  Compiler: GCC 4.8 + Clang 3.4-1ubuntu3 + LLVM 3.4 + CUDA 6.5
  Screen Resolution: 1920x1080

Compiler Details - i7 4700HQ: --build=x86_64-linux-gnu --disable-browser-plugin --disable-libmudflap --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64/jre --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.8-amd64 --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64 --with-multilib-list=m32,m64,mx32 --with-tune=generic -v

Processor Details - i7 4700HQ: Scaling Governor: acpi-cpufreq ondemand
Results overview (Seconds and Cycles Per Byte: fewer is better; MFLOPS, Mflops and Mbytes/s: more is better):

Test                                                Unit             Clang 3.3  Clang 3.5 SVN  i7 4700HQ
encode-mp3: WAV To MP3                              Seconds              16.36          18.76      14.60
encode-flac: WAV To FLAC                            Seconds               7.09           6.31       5.42
smallpt: Global Illumination Renderer; 100 Samples  Seconds                211            220        126
c-ray: Total Time                                   Seconds              63.66          62.34      27.36
himeno: Poisson Pressure Solver                     MFLOPS              770.32         811.80    1480.80
scimark2: Jacobi Successive Over-Relaxation         Mflops             1368.69        1368.69     996.27
scimark2: Dense LU Matrix Factorization             Mflops             1198.51        1325.06    2347.95
scimark2: Sparse Matrix Multiply                    Mflops             1016.39        1015.77    1902.11
scimark2: Fast Fourier Transform                    Mflops               74.16          69.01     255.92
scimark2: Monte Carlo                               Mflops              493.11         474.59     525.11
scimark2: Composite                                 Mflops              830.17         850.62    1204.07
botan: X9.19-MAC                                    Mbytes/s             75.17          75.86      74.95
botan: CAST-256                                     Mbytes/s            115.62         112.45      81.16
botan: Twofish                                      Mbytes/s            217.19         218.46     178.01
botan: AES-256                                      Mbytes/s            180.17         178.18     133.77
botan: KASUMI                                       Mbytes/s             79.39          63.17      65.29
botan: Tiger                                        Mbytes/s            361.16         360.78     361.13
blake2: Phoronix Test Suite v4.8.5                  Cycles Per Byte       9.66           9.11       4.16
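Because some tests are "fewer is better" and others "more is better", comparing the two Clang builds by eye is error-prone. The following is a minimal Python sketch, not part of the original export, that turns a few of the results above into a direction-aware percent change from Clang 3.3 to Clang 3.5 SVN. The names `RESULTS`, `improvement_pct` and `summarize` are hypothetical, and the selection of tests is arbitrary. For time-based tests, the "improvement" here is the percent reduction in the measured value, which is an assumption about how one might want to read the numbers.

```python
# Values copied from the results overview: (Clang 3.3, Clang 3.5 SVN, higher_is_better).
# The choice of tests is illustrative only.
RESULTS = {
    "encode-mp3 (s)": (16.36, 18.76, False),
    "encode-flac (s)": (7.09, 6.31, False),
    "c-ray (s)": (63.66, 62.34, False),
    "himeno (MFLOPS)": (770.32, 811.80, True),
    "scimark2 composite (Mflops)": (830.17, 850.62, True),
    "botan KASUMI (Mbytes/s)": (79.39, 63.17, True),
    "blake2 (cycles/byte)": (9.66, 9.11, False),
}


def improvement_pct(old, new, higher_is_better):
    """Percent improvement of `new` relative to `old`; negative means a regression."""
    change = (new - old) / old * 100.0
    # For "fewer is better" metrics a decrease is an improvement, so flip the sign.
    return change if higher_is_better else -change


def summarize(results):
    """Map each test name to its improvement, rounded to one decimal place."""
    return {name: round(improvement_pct(*vals), 1) for name, vals in results.items()}


if __name__ == "__main__":
    for name, pct in summarize(RESULTS).items():
        print(f"{name:30s} {pct:+6.1f}%")
```

Read this way, a positive number means Clang 3.5 SVN did better than Clang 3.3 on that test and a negative number means it did worse, regardless of the unit.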
LAME MP3 Encoding 3.99.3, WAV To MP3 (Seconds, Fewer Is Better)
  i7 4700HQ: 14.60 (SE +/- 0.09, N = 5)
  Clang 3.3: 16.36 (SE +/- 0.00, N = 5)
  Clang 3.5 SVN: 18.76 (SE +/- 0.01, N = 5)
  Per-result flags as exported: -fomit-frame-pointer -ffast-math -march=native -march=native
  1. (CC) gcc options: -O3 -pipe -lm
FLAC Audio Encoding 1.3.0, WAV To FLAC (Seconds, Fewer Is Better)
  i7 4700HQ: 5.42 (SE +/- 0.06, N = 9)
  Clang 3.5 SVN: 6.31 (SE +/- 0.00, N = 5)
  Clang 3.3: 7.09 (SE +/- 0.00, N = 5)
  Per-result flags as exported: -O2 -O3 -march=native -O3 -march=native
  1. (CXX) g++ options: -fvisibility=hidden -logg -lm
Smallpt 1.0, Global Illumination Renderer; 100 Samples (Seconds, Fewer Is Better)
  i7 4700HQ: 126 (SE +/- 1.93, N = 6)
  Clang 3.3: 211 (SE +/- 0.33, N = 3)
  Clang 3.5 SVN: 220 (SE +/- 0.33, N = 3)
  Per-result flags as exported: -O3 -march=native -O3 -march=native
  1. (CXX) g++ options: -fopenmp
C-Ray 1.1, Total Time (Seconds, Fewer Is Better)
  i7 4700HQ: 27.36 (SE +/- 0.16, N = 3)
  Clang 3.5 SVN: 62.34 (SE +/- 1.12, N = 6)
  Clang 3.3: 63.66 (SE +/- 1.09, N = 6)
  Per-result flags as exported: -march=native -march=native
  1. (CC) gcc options: -lm -lpthread -O3
Himeno Benchmark 3.0, Poisson Pressure Solver (MFLOPS, More Is Better)
  i7 4700HQ: 1480.80 (SE +/- 15.25, N = 3)
  Clang 3.5 SVN: 811.80 (SE +/- 2.25, N = 3)
  Clang 3.3: 770.32 (SE +/- 8.56, N = 3)
  Per-result flags as exported: -march=native -march=native
  1. (CC) gcc options: -O3
SciMark 2.0, Computational Test: Jacobi Successive Over-Relaxation (Mflops, More Is Better)
  Clang 3.5 SVN: 1368.69 (SE +/- 0.00, N = 4)
  Clang 3.3: 1368.69 (SE +/- 0.00, N = 4)
  i7 4700HQ: 996.27 (SE +/- 2.16, N = 4)
  Per-result flags as exported: -O3 -march=native -O3 -march=native
  1. (CXX) g++ options: (none listed)
SciMark 2.0, Computational Test: Dense LU Matrix Factorization (Mflops, More Is Better)
  i7 4700HQ: 2347.95 (SE +/- 10.10, N = 4)
  Clang 3.5 SVN: 1325.06 (SE +/- 1.64, N = 4)
  Clang 3.3: 1198.51 (SE +/- 1.56, N = 4)
  Per-result flags as exported: -O3 -march=native -O3 -march=native
  1. (CXX) g++ options: (none listed)
SciMark 2.0, Computational Test: Sparse Matrix Multiply (Mflops, More Is Better)
  i7 4700HQ: 1902.11 (SE +/- 35.22, N = 3)
  Clang 3.3: 1016.39 (SE +/- 1.46, N = 4)
  Clang 3.5 SVN: 1015.77 (SE +/- 2.79, N = 4)
  Per-result flags as exported: -O3 -march=native -O3 -march=native
  1. (CXX) g++ options: (none listed)
SciMark 2.0, Computational Test: Fast Fourier Transform (Mflops, More Is Better)
  i7 4700HQ: 255.92 (SE +/- 4.38, N = 4)
  Clang 3.3: 74.16 (SE +/- 0.64, N = 4)
  Clang 3.5 SVN: 69.01 (SE +/- 0.23, N = 2)
  Per-result flags as exported: -O3 -march=native -O3 -march=native
  1. (CXX) g++ options: (none listed)
SciMark 2.0, Computational Test: Monte Carlo (Mflops, More Is Better)
  i7 4700HQ: 525.11 (SE +/- 0.59, N = 4)
  Clang 3.3: 493.11 (SE +/- 0.57, N = 4)
  Clang 3.5 SVN: 474.59 (SE +/- 0.53, N = 4)
  Per-result flags as exported: -O3 -march=native -O3 -march=native
  1. (CXX) g++ options: (none listed)
SciMark 2.0, Computational Test: Composite (Mflops, More Is Better)
  i7 4700HQ: 1204.07 (SE +/- 7.60, N = 4)
  Clang 3.5 SVN: 850.62 (SE +/- 0.52, N = 4)
  Clang 3.3: 830.17 (SE +/- 0.55, N = 4)
  Per-result flags as exported: -O3 -march=native -O3 -march=native
  1. (CXX) g++ options: (none listed)
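SciMark 2.0 reports its composite score as the arithmetic mean of its five kernel scores (FFT, SOR, Monte Carlo, Sparse Matrix Multiply, LU). The short Python sketch below, which is not part of the original export, checks the composite figures above against the five per-kernel figures. The names `KERNELS`, `REPORTED` and `composite_gap` are hypothetical. The small gap it finds for the i7 4700HQ may come from the kernels being averaged over different run counts (N = 3 for Sparse Matrix Multiply, N = 4 elsewhere), but the export does not say, so treat that explanation as an assumption.

```python
from statistics import mean

# Per-kernel Mflops copied from the SciMark results above, in the order:
# Jacobi SOR, Dense LU, Sparse Matrix Multiply, FFT, Monte Carlo.
KERNELS = {
    "Clang 3.3": [1368.69, 1198.51, 1016.39, 74.16, 493.11],
    "Clang 3.5 SVN": [1368.69, 1325.06, 1015.77, 69.01, 474.59],
    "i7 4700HQ": [996.27, 2347.95, 1902.11, 255.92, 525.11],
}

# Composite Mflops as reported in the export.
REPORTED = {"Clang 3.3": 830.17, "Clang 3.5 SVN": 850.62, "i7 4700HQ": 1204.07}


def composite_gap(name):
    """Mean of the five kernel scores minus the reported composite, in Mflops."""
    return mean(KERNELS[name]) - REPORTED[name]


if __name__ == "__main__":
    for name in KERNELS:
        print(f"{name:14s} mean={mean(KERNELS[name]):8.2f} "
              f"reported={REPORTED[name]:8.2f} gap={composite_gap(name):+.2f}")
```

For the two Clang configurations the recomputed mean matches the reported composite to within rounding. A check like this is a cheap way to confirm that a flattened results table was reassembled with the right value in the right column.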
Botan 1.10.3, Test: X9.19-MAC (Mbytes/s, More Is Better)
  Clang 3.5 SVN: 75.86
  Clang 3.3: 75.17
  i7 4700HQ: 74.95
  1. (CXX) g++ options: -m64 -ldl -lpthread -lrt -O2
Botan 1.10.3, Test: CAST-256 (Mbytes/s, More Is Better)
  Clang 3.3: 115.62
  Clang 3.5 SVN: 112.45
  i7 4700HQ: 81.16
  1. (CXX) g++ options: -m64 -ldl -lpthread -lrt -O2
Botan 1.10.3, Test: Twofish (Mbytes/s, More Is Better)
  Clang 3.5 SVN: 218.46
  Clang 3.3: 217.19
  i7 4700HQ: 178.01
  1. (CXX) g++ options: -m64 -ldl -lpthread -lrt -O2
Botan 1.10.3, Test: AES-256 (Mbytes/s, More Is Better)
  Clang 3.3: 180.17
  Clang 3.5 SVN: 178.18
  i7 4700HQ: 133.77
  1. (CXX) g++ options: -m64 -ldl -lpthread -lrt -O2
Botan 1.10.3, Test: KASUMI (Mbytes/s, More Is Better)
  Clang 3.3: 79.39
  i7 4700HQ: 65.29
  Clang 3.5 SVN: 63.17
  1. (CXX) g++ options: -m64 -ldl -lpthread -lrt -O2
Botan 1.10.3, Test: Tiger (Mbytes/s, More Is Better)
  Clang 3.3: 361.16
  i7 4700HQ: 361.13
  Clang 3.5 SVN: 360.78
  1. (CXX) g++ options: -m64 -ldl -lpthread -lrt -O2
BLAKE2 20121223, Phoronix Test Suite v4.8.5 (Cycles Per Byte, Fewer Is Better)
  i7 4700HQ: 4.16 (SE +/- 0.00, N = 3)
  Clang 3.5 SVN: 9.11 (SE +/- 0.00, N = 3)
  Clang 3.3: 9.66 (SE +/- 0.03, N = 3)
  1. (CC) gcc options: -std=gnu99 -O3 -march=native -lcrypto -lz
Phoronix Test Suite v10.8.5