GCC Jetson TX1 GCC 4.8 4.9 5.2 Benchmarking
NVIDIA Jetson Tegra X1 (Cortex A57 rev 1) compiler benchmarking, covering GCC 4.8 through GCC 5.2.1 stable. Benchmarks by Michael Larabel for a future article on Phoronix.com.
HTML result view exported from: https://openbenchmarking.org/result/1511182-HA-GCCJETSON86&rdt&grt .

System Details
Processor: Cortex A57 rev 1 @ 1.91GHz (4 Cores)
Motherboard: jetson_tx1
Memory: 4096MB
Disk: 16GB 016G32
Graphics: NVIDIA TEGRA
OS: Ubuntu 14.04
Kernel: 3.10.67-g3a5c467 (aarch64)
Desktop: Unity 7.2.2
Display Server: X Server 1.15.1
Display Driver: NVIDIA 1.0.0
File-System: ext4
Screen Resolution: 3840x2160
Compiler (GCC 4.8.4 - Stock): GCC 4.8.4 + CUDA 7.0
Compiler (GCC 5.2.1): GCC 5.2.1 20151031 + CUDA 7.0
Compiler (GCC 4.9.3): GCC 4.9.3 + CUDA 7.0

Compiler Details
- GCC 4.8.4 - Stock: --build=arm-linux-gnueabihf --disable-browser-plugin --disable-libitm --disable-libmudflap --disable-libquadmath --disable-sjlj-exceptions --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf --with-arch-directory=arm --with-arch=armv7-a --with-float=hard --with-fpu=vfpv3-d16 --with-mode=thumb -v
- GCC 5.2.1: --build=arm-linux-gnueabihf --disable-browser-plugin --disable-libitm --disable-libquadmath --disable-libstdcxx-dual-abi --disable-sjlj-exceptions --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf --with-arch-directory=arm --with-arch=armv7-a --with-default-libstdcxx-abi=gcc4-compatible --with-float=hard --with-fpu=vfpv3-d16 --with-mode=thumb -v
- GCC 4.9.3: --build=arm-linux-gnueabihf --disable-browser-plugin --disable-libitm --disable-libquadmath --disable-sjlj-exceptions --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf --with-arch-directory=arm --with-arch=armv7-a --with-float=hard --with-fpu=vfpv3-d16 --with-mode=thumb -v

Processor Details
- Scaling Governor: tegra interactive
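
The three configure strings above are nearly identical, so the differences are easy to miss by eye. The following is a minimal Python sketch, not part of the original export, that splits each string into flags and reports the ones not shared by all three builds. The names `CONFIGURE_FLAGS` and `unique_flags` are illustrative.

```python
# Configure strings copied from the Compiler Details of this export.
CONFIGURE_FLAGS = {
    "GCC 4.8.4 - Stock": """
        --build=arm-linux-gnueabihf --disable-browser-plugin --disable-libitm
        --disable-libmudflap --disable-libquadmath --disable-sjlj-exceptions
        --disable-werror --enable-checking=release --enable-clocale=gnu
        --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk
        --enable-java-home --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++
        --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch
        --enable-multilib --enable-nls --enable-objc-gc --enable-plugin
        --enable-shared --enable-threads=posix --host=arm-linux-gnueabihf
        --target=arm-linux-gnueabihf --with-arch-directory=arm --with-arch=armv7-a
        --with-float=hard --with-fpu=vfpv3-d16 --with-mode=thumb -v
    """,
    "GCC 5.2.1": """
        --build=arm-linux-gnueabihf --disable-browser-plugin --disable-libitm
        --disable-libquadmath --disable-libstdcxx-dual-abi --disable-sjlj-exceptions
        --disable-werror --enable-checking=release --enable-clocale=gnu
        --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk
        --enable-java-home
        --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++
        --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch
        --enable-multilib --enable-multilib --enable-nls --enable-objc-gc
        --enable-plugin --enable-shared --enable-threads=posix
        --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf
        --with-arch-directory=arm --with-arch=armv7-a
        --with-default-libstdcxx-abi=gcc4-compatible --with-float=hard
        --with-fpu=vfpv3-d16 --with-mode=thumb -v
    """,
    "GCC 4.9.3": """
        --build=arm-linux-gnueabihf --disable-browser-plugin --disable-libitm
        --disable-libquadmath --disable-sjlj-exceptions --disable-werror
        --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object
        --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home
        --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++
        --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch
        --enable-multilib --enable-multilib --enable-nls --enable-objc-gc
        --enable-plugin --enable-shared --enable-threads=posix
        --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf
        --with-arch-directory=arm --with-arch=armv7-a --with-float=hard
        --with-fpu=vfpv3-d16 --with-mode=thumb -v
    """,
}


def unique_flags(configure_flags):
    """Return, per build, the sorted flags that are not shared by every build."""
    flag_sets = {name: set(text.split()) for name, text in configure_flags.items()}
    shared = set.intersection(*flag_sets.values())
    return {name: sorted(flags - shared) for name, flags in flag_sets.items()}


if __name__ == "__main__":
    for name, flags in unique_flags(CONFIGURE_FLAGS).items():
        print(name)
        for flag in flags:
            print("   ", flag)
```

Because the comparison uses sets, the repeated `--enable-multilib` in the GCC 4.9.3 and GCC 5.2.1 strings is ignored. The `--enable-languages` flag shows up for every build because GCC 5.2.1 adds `ada` to the list, so no single value is common to all three.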

Result Overview
Test                                                GCC 4.8.4 - Stock  GCC 5.2.1  GCC 4.9.3
compress-7zip: Compress Speed Test                  4294               3960       4171
c-ray: Total Time                                   93.58              96.67      101.35
dolfyn: Computational Fluid Dynamics                108.79             113.06     109.68
ffte: N=64, 1D Complex FFT Routine                  2005.31            2040.03    2041.50
fftw: Stock - 2D FFT Size 1024                      189.48             306.19     353.92
john-the-ripper: MD5                                32229              21894      30269
n-queens: Elapsed Time                              119.50             132.34     122.10
npb: BT.A                                           3769.58            2781.28    3577.54
npb: EP.B                                           49.85              42.44      49.85
npb: LU.A                                           2191.56            1196.56    2018.76
npb: SP.A                                           1626.27            1215.11    1542.82
compress-pbzip2: 256MB File Compression             35.38              38.09      32.96
scimark2: Composite                                 367.45             354.41     363.59
scimark2: Monte Carlo                               187.50             188.55     186.75
scimark2: Fast Fourier Transform                    37.94              57.86      63.60
scimark2: Sparse Matrix Multiply                    463.17             443.31     458.87
scimark2: Dense LU Matrix Factorization             700.46             654.41     680.43
scimark2: Jacobi Successive Over-Relaxation         448.16             427.92     428.29
smallpt: Global Illumination Renderer; 100 Samples  614                718        643
vpxenc: vpxenc                                      9.20               7.27       7.82
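
The overview mixes "more is better" and "fewer is better" metrics, so raw numbers cannot be compared across rows. The sketch below, which is an illustration rather than anything produced by the Phoronix Test Suite, normalizes each result against the GCC 4.8.4 baseline so that a ratio above 1 always means faster. The direction flags are taken from the per-test results in this export. The shortened test labels and the helper names `speedup`, `geometric_mean`, and `summarize` are hypothetical. Note that the SciMark composite is included alongside its five kernels, which gives SciMark extra weight in any overall mean.

```python
import math

# (test, higher_is_better, GCC 4.8.4 - Stock, GCC 5.2.1, GCC 4.9.3)
RESULTS = [
    ("compress-7zip", True, 4294, 3960, 4171),
    ("c-ray", False, 93.58, 96.67, 101.35),
    ("dolfyn", False, 108.79, 113.06, 109.68),
    ("ffte", True, 2005.31, 2040.03, 2041.50),
    ("fftw", True, 189.48, 306.19, 353.92),
    ("john-the-ripper MD5", True, 32229, 21894, 30269),
    ("n-queens", False, 119.50, 132.34, 122.10),
    ("npb BT.A", True, 3769.58, 2781.28, 3577.54),
    ("npb EP.B", True, 49.85, 42.44, 49.85),
    ("npb LU.A", True, 2191.56, 1196.56, 2018.76),
    ("npb SP.A", True, 1626.27, 1215.11, 1542.82),
    ("compress-pbzip2", False, 35.38, 38.09, 32.96),
    ("scimark2 Composite", True, 367.45, 354.41, 363.59),
    ("scimark2 Monte Carlo", True, 187.50, 188.55, 186.75),
    ("scimark2 FFT", True, 37.94, 57.86, 63.60),
    ("scimark2 Sparse Matrix Multiply", True, 463.17, 443.31, 458.87),
    ("scimark2 Dense LU", True, 700.46, 654.41, 680.43),
    ("scimark2 Jacobi SOR", True, 448.16, 427.92, 428.29),
    ("smallpt", False, 614, 718, 643),
    ("vpxenc", True, 9.20, 7.27, 7.82),
]


def speedup(baseline, candidate, higher_is_better):
    """Ratio above 1 means the candidate is faster than the baseline."""
    if higher_is_better:
        return candidate / baseline
    return baseline / candidate


def geometric_mean(values):
    """Geometric mean, the usual way to average performance ratios."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def summarize(results):
    """Return (test, GCC 5.2.1 speedup, GCC 4.9.3 speedup) per row."""
    return [
        (name, speedup(base, gcc52, higher), speedup(base, gcc49, higher))
        for name, higher, base, gcc52, gcc49 in results
    ]


if __name__ == "__main__":
    rows = summarize(RESULTS)
    for name, s52, s49 in rows:
        print(f"{name:32s} GCC 5.2.1: {s52 - 1:+7.1%}   GCC 4.9.3: {s49 - 1:+7.1%}")
    print("geometric mean vs GCC 4.8.4, GCC 5.2.1:",
          round(geometric_mean(r[1] for r in rows), 3))
    print("geometric mean vs GCC 4.8.4, GCC 4.9.3:",
          round(geometric_mean(r[2] for r in rows), 3))
```

A geometric mean is used here because it treats a 2x gain and a 2x loss symmetrically; an arithmetic mean of ratios would not.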

7-Zip Compression 9.20.1
Compress Speed Test
MIPS, More Is Better
GCC 4.8.4 - Stock: 4294 (SE +/- 19.35, N = 3)
GCC 5.2.1: 3960 (SE +/- 8.25, N = 3)
GCC 4.9.3: 4171 (SE +/- 5.49, N = 3)
1. (CXX) g++ options: -pipe -lpthread

C-Ray 1.1
Total Time
Seconds, Fewer Is Better
GCC 4.8.4 - Stock: 93.58 (SE +/- 6.00, N = 6)
GCC 5.2.1: 96.67 (SE +/- 4.92, N = 6)
GCC 4.9.3: 101.35 (SE +/- 7.19, N = 6)
1. (CC) gcc options: -lm -lpthread -O3
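
The C-Ray standard errors are large relative to the gaps between compilers, which raises the question of whether the ordering is meaningful. One rough way to check, sketched below in Python, is to express the gap between two means in units of their combined standard error. This assumes the SE values in the export are listed in the same order as the compilers and that the runs are independent. The helper names `separation` and `compare_to_baseline` are illustrative, and any cutoff (for example, treating a separation under about 2 as inconclusive) is a rule of thumb rather than something the export states. The FFTW figures from this same result file are included for contrast.

```python
import math


def separation(mean_a, se_a, mean_b, se_b):
    """Gap between two means, in units of their combined standard error."""
    return abs(mean_a - mean_b) / math.hypot(se_a, se_b)


# (mean, standard error) per compiler, copied from this result file.
C_RAY = {
    "GCC 4.8.4 - Stock": (93.58, 6.00),
    "GCC 5.2.1": (96.67, 4.92),
    "GCC 4.9.3": (101.35, 7.19),
}
FFTW = {
    "GCC 4.8.4 - Stock": (189.48, 2.04),
    "GCC 5.2.1": (306.19, 1.06),
    "GCC 4.9.3": (353.92, 3.00),
}


def compare_to_baseline(results, baseline="GCC 4.8.4 - Stock"):
    """Separation of every non-baseline compiler from the baseline."""
    base_mean, base_se = results[baseline]
    return {
        name: separation(base_mean, base_se, mean, se)
        for name, (mean, se) in results.items()
        if name != baseline
    }


if __name__ == "__main__":
    for label, data in (("C-Ray", C_RAY), ("FFTW", FFTW)):
        for name, value in compare_to_baseline(data).items():
            print(f"{label:6s} {name:10s} separation = {value:.2f}")
```

For C-Ray both separations come out below 1, so the differences there sit inside the run-to-run noise, while the FFTW gaps are many times larger than their combined error.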

Dolfyn 0.527
Computational Fluid Dynamics
Seconds, Fewer Is Better
GCC 4.8.4 - Stock: 108.79 (SE +/- 0.28, N = 3)
GCC 5.2.1: 113.06 (SE +/- 0.24, N = 3)
GCC 4.9.3: 109.68 (SE +/- 0.01, N = 3)

FFTE 5.0
Test: N=64, 1D Complex FFT Routine
MFLOPS, More Is Better
GCC 4.8.4 - Stock: 2005.31 (SE +/- 35.22, N = 6)
GCC 5.2.1: 2040.03 (SE +/- 0.33, N = 3)
GCC 4.9.3: 2041.50 (SE +/- 0.36, N = 3)
1. (F9X) gfortran options: -O3 -fomit-frame-pointer -fopenmp -pthread -lmpi_f90 -lmpi_f77 -lmpi -ldl -lhwloc

FFTW 3.3.4
Build: Stock - Size: 2D FFT Size 1024
Mflops, More Is Better
GCC 4.8.4 - Stock: 189.48 (SE +/- 2.04, N = 5)
GCC 5.2.1: 306.19 (SE +/- 1.06, N = 5)
GCC 4.9.3: 353.92 (SE +/- 3.00, N = 5)
Per-run flag annotation: -std=gnu99 (shown twice in the export; the run labels were not preserved)
1. (CC) gcc options: -O3 -fomit-frame-pointer -fstrict-aliasing -fno-schedule-insns -ffast-math -lm

John The Ripper 1.8.0
Test: MD5
Real C/S, More Is Better
GCC 4.8.4 - Stock: 32229 (SE +/- 0.00, N = 3)
GCC 5.2.1: 21894 (SE +/- 114.62, N = 3)
GCC 4.9.3: 30269 (SE +/- 101.26, N = 3)
1. (CC) gcc options: -fopenmp

N-Queens 1.0
Elapsed Time
Seconds, Fewer Is Better
GCC 4.8.4 - Stock: 119.50 (SE +/- 0.06, N = 3)
GCC 5.2.1: 132.34 (SE +/- 0.31, N = 3)
GCC 4.9.3: 122.10 (SE +/- 0.09, N = 3)
1. (CC) gcc options: -static -fopenmp -O3

NAS Parallel Benchmarks 3.3
Test / Class: BT.A
Total Mop/s, More Is Better
GCC 4.8.4 - Stock: 3769.58 (SE +/- 8.11, N = 3)
GCC 5.2.1: 2781.28 (SE +/- 7.85, N = 3)
GCC 4.9.3: 3577.54 (SE +/- 4.13, N = 3)
1. (F9X) gfortran options: -fopenmp

NAS Parallel Benchmarks 3.3
Test / Class: EP.B
Total Mop/s, More Is Better
GCC 4.8.4 - Stock: 49.85 (SE +/- 0.85, N = 3)
GCC 5.2.1: 42.44 (SE +/- 0.57, N = 3)
GCC 4.9.3: 49.85 (SE +/- 0.07, N = 3)
1. (F9X) gfortran options: -fopenmp

NAS Parallel Benchmarks 3.3
Test / Class: LU.A
Total Mop/s, More Is Better
GCC 4.8.4 - Stock: 2191.56 (SE +/- 4.35, N = 3)
GCC 5.2.1: 1196.56 (SE +/- 12.89, N = 3)
GCC 4.9.3: 2018.76 (SE +/- 5.25, N = 3)
1. (F9X) gfortran options: -fopenmp

NAS Parallel Benchmarks 3.3
Test / Class: SP.A
Total Mop/s, More Is Better
GCC 4.8.4 - Stock: 1626.27 (SE +/- 3.63, N = 3)
GCC 5.2.1: 1215.11 (SE +/- 4.54, N = 3)
GCC 4.9.3: 1542.82 (SE +/- 1.48, N = 3)
1. (F9X) gfortran options: -fopenmp

Parallel BZIP2 Compression 1.1.6
256MB File Compression
Seconds, Fewer Is Better
GCC 4.8.4 - Stock: 35.38 (SE +/- 0.11, N = 3)
GCC 5.2.1: 38.09 (SE +/- 0.17, N = 3)
GCC 4.9.3: 32.96 (SE +/- 0.18, N = 3)
1. (CXX) g++ options: -O2 -pthread -lbz2 -lpthread

SciMark 2.0
Computational Test: Composite
Mflops, More Is Better
GCC 4.8.4 - Stock: 367.45 (SE +/- 0.28, N = 4)
GCC 5.2.1: 354.41 (SE +/- 0.87, N = 4)
GCC 4.9.3: 363.59 (SE +/- 0.59, N = 4)

SciMark 2.0
Computational Test: Monte Carlo
Mflops, More Is Better
GCC 4.8.4 - Stock: 187.50 (SE +/- 0.82, N = 4)
GCC 5.2.1: 188.55 (SE +/- 0.91, N = 4)
GCC 4.9.3: 186.75 (SE +/- 0.97, N = 4)

SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops, More Is Better
GCC 4.8.4 - Stock: 37.94 (SE +/- 0.22, N = 4)
GCC 5.2.1: 57.86 (SE +/- 2.03, N = 4)
GCC 4.9.3: 63.60 (SE +/- 0.39, N = 4)

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops, More Is Better
GCC 4.8.4 - Stock: 463.17 (SE +/- 0.51, N = 4)
GCC 5.2.1: 443.31 (SE +/- 1.66, N = 4)
GCC 4.9.3: 458.87 (SE +/- 0.76, N = 4)

SciMark 2.0
Computational Test: Dense LU Matrix Factorization
Mflops, More Is Better
GCC 4.8.4 - Stock: 700.46 (SE +/- 0.61, N = 4)
GCC 5.2.1: 654.41 (SE +/- 2.84, N = 4)
GCC 4.9.3: 680.43 (SE +/- 1.92, N = 4)

SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops, More Is Better
GCC 4.8.4 - Stock: 448.16 (SE +/- 0.07, N = 4)
GCC 5.2.1: 427.92 (SE +/- 0.20, N = 4)
GCC 4.9.3: 428.29 (SE +/- 0.15, N = 4)
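
SciMark 2.0 reports its composite score as the arithmetic mean of its five kernel scores, so the Composite result is not an independent measurement. The short Python sketch below, added here as a consistency check and not taken from the export, recomputes the composite from the five kernel results listed above and compares it with the reported value. The names `KERNELS`, `REPORTED_COMPOSITE`, and `recomputed_composite` are illustrative.

```python
# Kernel scores in Mflops, in the order: Monte Carlo, Fast Fourier Transform,
# Sparse Matrix Multiply, Dense LU Matrix Factorization, Jacobi SOR.
KERNELS = {
    "GCC 4.8.4 - Stock": [187.50, 37.94, 463.17, 700.46, 448.16],
    "GCC 5.2.1": [188.55, 57.86, 443.31, 654.41, 427.92],
    "GCC 4.9.3": [186.75, 63.60, 458.87, 680.43, 428.29],
}

# Composite scores as reported in this result file.
REPORTED_COMPOSITE = {
    "GCC 4.8.4 - Stock": 367.45,
    "GCC 5.2.1": 354.41,
    "GCC 4.9.3": 363.59,
}


def recomputed_composite(kernel_scores):
    """Arithmetic mean of the kernel scores."""
    return sum(kernel_scores) / len(kernel_scores)


if __name__ == "__main__":
    for name, scores in KERNELS.items():
        mean = recomputed_composite(scores)
        reported = REPORTED_COMPOSITE[name]
        print(f"{name:18s} recomputed = {mean:.2f}  reported = {reported:.2f}")
```

The recomputed means agree with the reported composites to within rounding. This also shows why the GCC 5.2.1 and GCC 4.9.3 composites trail GCC 4.8.4 despite their much higher FFT scores: the FFT kernel has the smallest absolute Mflops, so its gain is outweighed by smaller percentage losses in the Dense LU, Sparse Matrix Multiply, and Jacobi kernels.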

Smallpt 1.0
Global Illumination Renderer; 100 Samples
Seconds, Fewer Is Better
GCC 4.8.4 - Stock: 614 (SE +/- 1.33, N = 3)
GCC 5.2.1: 718 (SE +/- 0.88, N = 3)
GCC 4.9.3: 643 (SE +/- 1.15, N = 3)
1. (CXX) g++ options: -fopenmp

VP8 libvpx Encoding 1.3.0
vpxenc
Frames Per Second, More Is Better
GCC 4.8.4 - Stock: 9.20 (SE +/- 0.29, N = 6)
GCC 5.2.1: 7.27 (SE +/- 0.24, N = 6)
GCC 4.9.3: 7.82 (SE +/- 0.33, N = 6)
1. (CXX) g++ options: -lvpx -lgtest -lpthread -lm -O3

Phoronix Test Suite v10.8.4