NVIDIA Tegra K1 GCC Clang Compilers

GCC 4.8.2 vs. Clang 3.4 on NVIDIA's Tegra K1 SoC, using the NVIDIA Jetson TK1 ARM quad-core Cortex-A15 development board. Benchmarks by Michael Larabel for a future article on Phoronix.com.
HTML result view exported from: https://openbenchmarking.org/result/1405138-KH-NVIDIATEG53 .
Test systems: Clang 3.4, GCC 4.8.2

  Processor:          ARMv7 rev 3 @ 2.32GHz (4 Cores)
  Motherboard:        laguna
  Memory:             2048MB
  Disk:               16GB SEM16G
  Graphics:           GK20A/AXI
  Network:            Realtek RTL8111/8168/8411
  OS:                 Ubuntu 14.04
  Kernel:             3.10.24-g6a2d13a (armv7l)
  Desktop:            Unity 7.2.0
  Display Server:     X Server 1.15.1
  Display Driver:     NVIDIA 19.2
  OpenGL:             4.3.0
  Compiler:           Clang 3.4-1ubuntu3 (Clang 3.4 run)
                      GCC 4.8.2 + Clang 3.4-1ubuntu3 (GCC 4.8.2 run)
  File-System:        ext4
  Screen Resolution:  1920x1080

Processor Details - Scaling Governor: tegra ondemand

Compiler Details - GCC 4.8.2: --build=arm-linux-gnueabihf --disable-browser-plugin --disable-libitm --disable-libmudflap --disable-libquadmath --disable-sjlj-exceptions --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf --with-arch-directory=arm --with-arch=armv7-a --with-float=hard --with-fpu=vfpv3-d16 --with-mode=thumb -v
Result overview

  Test                                             Clang 3.4   GCC 4.8.2
  scimark2: Composite                                 393.13      351.71
  scimark2: Monte Carlo                               275.42      231.91
  scimark2: Fast Fourier Transform                     58.04       45.98
  scimark2: Sparse Matrix Multiply                    369.93      374.74
  scimark2: Dense LU Matrix Factorization             624.29      625.32
  scimark2: Jacobi Successive Over-Relaxation         637.99      480.59
  graphics-magick: Blur                                   30          48
  graphics-magick: Sharpen                                13          46
  graphics-magick: Resizing                               29          65
  graphics-magick: HWB Color Space                        55          72
  graphics-magick: Local Adaptive Thresholding            30          36
  himeno: Poisson Pressure Solver                     224.67      142.37
  build-apache: Time To Compile                        83.68      128.21
  build-imagemagick: Time To Compile                  146.80      346.30
  c-ray: Total Time                                   207.48       86.01
  encode-flac: WAV To FLAC                             17.93       15.16
  encode-mp3: WAV To MP3                               37.99       36.63
  apache: Static Web Page Serving                    4565.50     4601.34
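The overview mixes "more is better" and "fewer is better" tests, so the raw numbers cannot be compared column by column. The following is a minimal Python sketch, not part of the original export, that normalizes each pair into a single Clang-versus-GCC ratio where a value above 1.0 means Clang was faster. The names `RESULTS`, `clang_vs_gcc`, and `relative_table` are hypothetical; the direction flags are taken from the per-test chart labels.

```python
# Values copied from the result overview.
# The boolean is True for "More Is Better" tests and False for "Fewer Is Better".
RESULTS = [
    ("scimark2: Composite", True, 393.13, 351.71),
    ("scimark2: Monte Carlo", True, 275.42, 231.91),
    ("scimark2: Fast Fourier Transform", True, 58.04, 45.98),
    ("scimark2: Sparse Matrix Multiply", True, 369.93, 374.74),
    ("scimark2: Dense LU Matrix Factorization", True, 624.29, 625.32),
    ("scimark2: Jacobi Successive Over-Relaxation", True, 637.99, 480.59),
    ("graphics-magick: Blur", True, 30, 48),
    ("graphics-magick: Sharpen", True, 13, 46),
    ("graphics-magick: Resizing", True, 29, 65),
    ("graphics-magick: HWB Color Space", True, 55, 72),
    ("graphics-magick: Local Adaptive Thresholding", True, 30, 36),
    ("himeno: Poisson Pressure Solver", True, 224.67, 142.37),
    ("build-apache: Time To Compile", False, 83.68, 128.21),
    ("build-imagemagick: Time To Compile", False, 146.80, 346.30),
    ("c-ray: Total Time", False, 207.48, 86.01),
    ("encode-flac: WAV To FLAC", False, 17.93, 15.16),
    ("encode-mp3: WAV To MP3", False, 37.99, 36.63),
    ("apache: Static Web Page Serving", True, 4565.50, 4601.34),
]


def clang_vs_gcc(higher_is_better, clang, gcc):
    """Return Clang's performance relative to GCC (above 1.0: Clang faster)."""
    return clang / gcc if higher_is_better else gcc / clang


def relative_table(results=RESULTS):
    """Return (test name, Clang-vs-GCC ratio) pairs in overview order."""
    return [(name, clang_vs_gcc(hib, clang, gcc))
            for name, hib, clang, gcc in results]


if __name__ == "__main__":
    for name, ratio in relative_table():
        print(f"{name:46s} {ratio:5.2f}x")
```

Inverting the ratio for time-based tests keeps one reading for every row: for example, the two timed compilation tests come out above 1.0 and C-Ray comes out below 1.0.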
SciMark 2.0 (Mflops, More Is Better)

  Computational Test: Composite
    Clang 3.4:  393.13  (SE +/- 0.21, N = 4)
    GCC 4.8.2:  351.71  (SE +/- 0.11, N = 4)

  Computational Test: Monte Carlo
    Clang 3.4:  275.42  (SE +/- 0.07, N = 4)
    GCC 4.8.2:  231.91  (SE +/- 0.01, N = 4)

  Computational Test: Fast Fourier Transform
    Clang 3.4:  58.04  (SE +/- 0.04, N = 4)
    GCC 4.8.2:  45.98  (SE +/- 0.28, N = 4)

  Computational Test: Sparse Matrix Multiply
    Clang 3.4:  369.93  (SE +/- 0.40, N = 4)
    GCC 4.8.2:  374.74  (SE +/- 0.22, N = 4)

  Computational Test: Dense LU Matrix Factorization
    Clang 3.4:  624.29  (SE +/- 0.31, N = 4)
    GCC 4.8.2:  625.32  (SE +/- 0.40, N = 4)

  Computational Test: Jacobi Successive Over-Relaxation
    Clang 3.4:  637.99  (SE +/- 0.45, N = 4)
    GCC 4.8.2:  480.59  (SE +/- 0.03, N = 4)
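As a consistency check on the SciMark figures, the Composite score can be recomputed from the five kernel results. The Python sketch below assumes the Composite is the arithmetic mean of the five kernels, which is how SciMark 2.0 defines it as far as I know; `KERNELS`, `REPORTED`, and `composite` are hypothetical names, and the values are copied from the results above.

```python
# Kernel scores in Mflops, in the order: Monte Carlo, FFT,
# Sparse Matrix Multiply, Dense LU, Jacobi SOR.
KERNELS = {
    "Clang 3.4": [275.42, 58.04, 369.93, 624.29, 637.99],
    "GCC 4.8.2": [231.91, 45.98, 374.74, 625.32, 480.59],
}

# Composite scores as reported in the export.
REPORTED = {"Clang 3.4": 393.13, "GCC 4.8.2": 351.71}


def composite(scores):
    """Arithmetic mean of the kernel scores (assumed Composite definition)."""
    return sum(scores) / len(scores)


if __name__ == "__main__":
    for compiler, scores in KERNELS.items():
        print(f"{compiler}: recomputed {composite(scores):.2f}, "
              f"reported {REPORTED[compiler]:.2f}")
```

Because the mean is unweighted, the large Jacobi Successive Over-Relaxation gap accounts for most of the Composite difference, while the two kernels where GCC is marginally ahead barely move it.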
GraphicsMagick 1.3.19 (Iterations Per Minute, More Is Better)

  Operation: Blur
    Clang 3.4:  30  (SE +/- 0.00, N = 3)
    GCC 4.8.2:  48  (SE +/- 0.00, N = 3)

  Operation: Sharpen
    Clang 3.4:  13  (SE +/- 0.21, N = 6)
    GCC 4.8.2:  46  (SE +/- 0.00, N = 3)

  Operation: Resizing
    Clang 3.4:  29  (SE +/- 0.00, N = 3)
    GCC 4.8.2:  65  (SE +/- 0.33, N = 3)

  Operation: HWB Color Space
    Clang 3.4:  55  (SE +/- 0.00, N = 3)
    GCC 4.8.2:  72  (SE +/- 0.00, N = 3)

  Operation: Local Adaptive Thresholding
    Clang 3.4:  30  (SE +/- 0.00, N = 3)
    GCC 4.8.2:  36  (SE +/- 0.33, N = 3)

  Per-result options listed on each chart: -std=gnu99 -fopenmp -lgomp
  1. (CC) gcc options: -O2 -pthread -ljbig -lwebp -ljpeg -lXext -lSM -lICE -lX11 -llzma -lxml2 -lz -lm -lpthread
Himeno Benchmark 3.0, Poisson Pressure Solver (MFLOPS, More Is Better)
    Clang 3.4:  224.67  (SE +/- 3.56, N = 3)
    GCC 4.8.2:  142.37  (SE +/- 2.19, N = 5)
  1. (CC) gcc options: -O3

Timed Apache Compilation 2.4.7, Time To Compile (Seconds, Fewer Is Better)
    Clang 3.4:  83.68   (SE +/- 0.45, N = 3)
    GCC 4.8.2:  128.21  (SE +/- 0.68, N = 3)

Timed ImageMagick Compilation 6.8.1-10, Time To Compile (Seconds, Fewer Is Better)
    Clang 3.4:  146.80  (SE +/- 0.62, N = 3)
    GCC 4.8.2:  346.30  (SE +/- 0.20, N = 3)

C-Ray 1.1, Total Time (Seconds, Fewer Is Better)
    Clang 3.4:  207.48  (SE +/- 1.04, N = 3)
    GCC 4.8.2:  86.01   (SE +/- 0.18, N = 3)
  1. (CC) gcc options: -lm -lpthread -O3
FLAC Audio Encoding 1.3.0, WAV To FLAC (Seconds, Fewer Is Better)
    Clang 3.4:  17.93  (SE +/- 0.15, N = 5)
    GCC 4.8.2:  15.16  (SE +/- 0.12, N = 5)
  1. (CXX) g++ options: -O2 -fvisibility=hidden -logg -lm

LAME MP3 Encoding 3.99.3, WAV To MP3 (Seconds, Fewer Is Better)
    Clang 3.4:  37.99  (SE +/- 0.16, N = 5)
    GCC 4.8.2:  36.63  (SE +/- 0.14, N = 5)
  Per-result options listed on the chart: -funroll-loops -fomit-frame-pointer
  1. (CC) gcc options: -O3 -ffast-math -pipe -lm

Apache Benchmark 2.4.7, Static Web Page Serving (Requests Per Second, More Is Better)
    Clang 3.4:  4565.50  (SE +/- 23.54, N = 3)
    GCC 4.8.2:  4601.34  (SE +/- 45.72, N = 3)
  1. (CC) gcc options: -shared -fPIC -O2 -pthread
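Each result is reported with a standard error (SE) and a run count (N), which gives a rough sense of whether a gap between the two compilers exceeds run-to-run noise. The Python sketch below expresses the gap in units of the combined standard error. It is a heuristic under stated assumptions, not a formal significance test: N is only 3 to 6, the two SE values on each chart are assumed to be listed in the same order as the results (Clang 3.4 first), and `gap_in_standard_errors` is a hypothetical helper name.

```python
import math


def gap_in_standard_errors(a, se_a, b, se_b):
    """Return |a - b| divided by the combined standard error of both means.

    The combined SE is sqrt(se_a**2 + se_b**2), which assumes the two
    sets of runs are independent.
    """
    gap = abs(a - b)
    combined = math.hypot(se_a, se_b)
    if combined == 0.0:
        # Both SEs were reported as 0.00: any nonzero gap is unbounded here.
        return math.inf if gap > 0 else 0.0
    return gap / combined


if __name__ == "__main__":
    # Apache Benchmark, Requests Per Second.
    print(gap_in_standard_errors(4565.50, 23.54, 4601.34, 45.72))
    # SciMark Sparse Matrix Multiply, Mflops.
    print(gap_in_standard_errors(369.93, 0.40, 374.74, 0.22))
```

With the Apache Benchmark figures the gap is smaller than one combined standard error, so that result is best read as a tie, whereas the small Sparse Matrix Multiply gap is many standard errors wide because those runs were very repeatable.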
Phoronix Test Suite v10.8.4