GCC Jetson TX1 GCC 4.8 4.9 5.2 Benchmarking

Cortex A57 rev 1 testing: NVIDIA Jetson Tegra X1 compiler benchmarking with GCC 4.8 through GCC 5.2.1 stable. Benchmarks by Michael Larabel for a future article on Phoronix.com.
HTML result view exported from: https://openbenchmarking.org/result/1511182-HA-GCCJETSON86&gru&sor

Processor:          Cortex A57 rev 1 @ 1.91GHz (4 Cores)
Motherboard:        jetson_tx1
Memory:             4096MB
Disk:               16GB 016G32
Graphics:           NVIDIA TEGRA
OS:                 Ubuntu 14.04
Kernel:             3.10.67-g3a5c467 (aarch64)
Desktop:            Unity 7.2.2
Display Server:     X Server 1.15.1
Display Driver:     NVIDIA 1.0.0
File-System:        ext4
Screen Resolution:  3840x2160
Compiler:
  GCC 4.8.4 - Stock:  GCC 4.8.4 + CUDA 7.0
  GCC 4.9.3:          GCC 4.9.3 + CUDA 7.0
  GCC 5.2.1:          GCC 5.2.1 20151031 + CUDA 7.0

Compiler Details
- GCC 4.8.4 - Stock: --build=arm-linux-gnueabihf --disable-browser-plugin --disable-libitm --disable-libmudflap --disable-libquadmath --disable-sjlj-exceptions --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf --with-arch-directory=arm --with-arch=armv7-a --with-float=hard --with-fpu=vfpv3-d16 --with-mode=thumb -v
- GCC 4.9.3: --build=arm-linux-gnueabihf --disable-browser-plugin --disable-libitm --disable-libquadmath --disable-sjlj-exceptions --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf --with-arch-directory=arm --with-arch=armv7-a --with-float=hard --with-fpu=vfpv3-d16 --with-mode=thumb -v
- GCC 5.2.1: --build=arm-linux-gnueabihf --disable-browser-plugin --disable-libitm --disable-libquadmath --disable-libstdcxx-dual-abi --disable-sjlj-exceptions --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf --with-arch-directory=arm --with-arch=armv7-a --with-default-libstdcxx-abi=gcc4-compatible --with-float=hard --with-fpu=vfpv3-d16 --with-mode=thumb -v

Processor Details
- Scaling Governor: tegra interactive

Test                                                  GCC 4.8.4 - Stock   GCC 4.9.3   GCC 5.2.1
vpxenc: vpxenc                                                     9.20        7.82        7.27
ffte: N=64, 1D Complex FFT Routine                              2005.31     2041.50     2040.03
fftw: Stock - 2D FFT Size 1024                                   189.48      353.92      306.19
scimark2: Composite                                              367.45      363.59      354.41
scimark2: Monte Carlo                                            187.50      186.75      188.55
scimark2: Fast Fourier Transform                                  37.94       63.60       57.86
scimark2: Sparse Matrix Multiply                                 463.17      458.87      443.31
scimark2: Dense LU Matrix Factorization                          700.46      680.43      654.41
scimark2: Jacobi Successive Over-Relaxation                      448.16      428.29      427.92
compress-7zip: Compress Speed Test                                 4294        4171        3960
john-the-ripper: MD5                                              32229       30269       21894
npb: BT.A                                                       3769.58     3577.54     2781.28
npb: EP.B                                                         49.85       49.85       42.44
npb: LU.A                                                       2191.56     2018.76     1196.56
npb: SP.A                                                       1626.27     1542.82     1215.11
dolfyn: Computational Fluid Dynamics                             108.79      109.68      113.06
c-ray: Total Time                                                 93.58      101.35       96.67
compress-pbzip2: 256MB File Compression                           35.38       32.96       38.09
smallpt: Global Illumination Renderer; 100 Samples                  614         643         718
n-queens: Elapsed Time                                           119.50      122.10      132.34

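The overview mixes "More Is Better" and "Fewer Is Better" units, so the raw numbers are not directly comparable across rows. The following is a minimal Python sketch of one way to express each result as a percent change against the GCC 4.8.4 baseline. The names `RESULTS`, `relative_change`, and `summarize` are hypothetical and not part of the Phoronix Test Suite, only four rows are copied from the table for brevity, and treating time-based results as `baseline / value` is one convention among several.

```python
# Hypothetical helper, not part of the Phoronix Test Suite.
# Each entry: ((GCC 4.8.4, GCC 4.9.3, GCC 5.2.1), higher_is_better),
# with values copied from the overview table.
RESULTS = {
    "vpxenc (FPS)": ((9.20, 7.82, 7.27), True),
    "fftw 2D FFT 1024 (Mflops)": ((189.48, 353.92, 306.19), True),
    "npb LU.A (Mop/s)": ((2191.56, 2018.76, 1196.56), True),
    "n-queens (s)": ((119.50, 122.10, 132.34), False),
}


def relative_change(baseline, value, higher_is_better):
    """Percent change in performance versus the baseline; positive means faster."""
    if higher_is_better:
        return (value / baseline - 1.0) * 100.0
    # For time-based results a smaller value is better, so invert the ratio.
    return (baseline / value - 1.0) * 100.0


def summarize(results):
    """Map each test name to (GCC 4.9.3 change, GCC 5.2.1 change) in percent."""
    rows = {}
    for name, (values, higher_is_better) in results.items():
        baseline = values[0]
        rows[name] = tuple(
            round(relative_change(baseline, v, higher_is_better), 1)
            for v in values[1:]
        )
    return rows


if __name__ == "__main__":
    for name, (gcc49, gcc52) in summarize(RESULTS).items():
        print(f"{name:28s} GCC 4.9.3: {gcc49:+6.1f}%   GCC 5.2.1: {gcc52:+6.1f}%")
```

A positive figure means the newer compiler produced a faster binary than GCC 4.8.4 on that test, a negative figure a slower one. The sketch ignores the reported standard errors, so small differences should not be read as significant.
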
VP8 libvpx Encoding 1.3.0
vpxenc
Frames Per Second, More Is Better
  GCC 4.8.4 - Stock   9.20   (SE +/- 0.29, N = 6)
  GCC 4.9.3           7.82   (SE +/- 0.33, N = 6)
  GCC 5.2.1           7.27   (SE +/- 0.24, N = 6)
1. (CXX) g++ options: -lvpx -lgtest -lpthread -lm -O3

FFTE 5.0
Test: N=64, 1D Complex FFT Routine
MFLOPS, More Is Better
  GCC 4.9.3           2041.50   (SE +/- 0.36, N = 3)
  GCC 5.2.1           2040.03   (SE +/- 0.33, N = 3)
  GCC 4.8.4 - Stock   2005.31   (SE +/- 35.22, N = 6)
1. (F9X) gfortran options: -O3 -fomit-frame-pointer -fopenmp -pthread -lmpi_f90 -lmpi_f77 -lmpi -ldl -lhwloc

FFTW 3.3.4
Build: Stock - Size: 2D FFT Size 1024
Mflops, More Is Better
  GCC 4.9.3           353.92   (SE +/- 3.00, N = 5)
  GCC 5.2.1           306.19   (SE +/- 1.06, N = 5)
  GCC 4.8.4 - Stock   189.48   (SE +/- 2.04, N = 5)
Additional flag listed for two of the three results: -std=gnu99
1. (CC) gcc options: -O3 -fomit-frame-pointer -fstrict-aliasing -fno-schedule-insns -ffast-math -lm

SciMark 2.0
Computational Test: Composite
Mflops, More Is Better
  GCC 4.8.4 - Stock   367.45   (SE +/- 0.28, N = 4)
  GCC 4.9.3           363.59   (SE +/- 0.59, N = 4)
  GCC 5.2.1           354.41   (SE +/- 0.87, N = 4)

SciMark 2.0
Computational Test: Monte Carlo
Mflops, More Is Better
  GCC 5.2.1           188.55   (SE +/- 0.91, N = 4)
  GCC 4.8.4 - Stock   187.50   (SE +/- 0.82, N = 4)
  GCC 4.9.3           186.75   (SE +/- 0.97, N = 4)

SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops, More Is Better
  GCC 4.9.3           63.60   (SE +/- 0.39, N = 4)
  GCC 5.2.1           57.86   (SE +/- 2.03, N = 4)
  GCC 4.8.4 - Stock   37.94   (SE +/- 0.22, N = 4)

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops, More Is Better
  GCC 4.8.4 - Stock   463.17   (SE +/- 0.51, N = 4)
  GCC 4.9.3           458.87   (SE +/- 0.76, N = 4)
  GCC 5.2.1           443.31   (SE +/- 1.66, N = 4)

SciMark 2.0
Computational Test: Dense LU Matrix Factorization
Mflops, More Is Better
  GCC 4.8.4 - Stock   700.46   (SE +/- 0.61, N = 4)
  GCC 4.9.3           680.43   (SE +/- 1.92, N = 4)
  GCC 5.2.1           654.41   (SE +/- 2.84, N = 4)

SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops, More Is Better
  GCC 4.8.4 - Stock   448.16   (SE +/- 0.07, N = 4)
  GCC 4.9.3           428.29   (SE +/- 0.15, N = 4)
  GCC 5.2.1           427.92   (SE +/- 0.20, N = 4)

7-Zip Compression 9.20.1
Compress Speed Test
MIPS, More Is Better
  GCC 4.8.4 - Stock   4294   (SE +/- 19.35, N = 3)
  GCC 4.9.3           4171   (SE +/- 5.49, N = 3)
  GCC 5.2.1           3960   (SE +/- 8.25, N = 3)
1. (CXX) g++ options: -pipe -lpthread

John The Ripper 1.8.0
Test: MD5
Real C/S, More Is Better
  GCC 4.8.4 - Stock   32229   (SE +/- 0.00, N = 3)
  GCC 4.9.3           30269   (SE +/- 101.26, N = 3)
  GCC 5.2.1           21894   (SE +/- 114.62, N = 3)
1. (CC) gcc options: -fopenmp

NAS Parallel Benchmarks 3.3
Test / Class: BT.A
Total Mop/s, More Is Better
  GCC 4.8.4 - Stock   3769.58   (SE +/- 8.11, N = 3)
  GCC 4.9.3           3577.54   (SE +/- 4.13, N = 3)
  GCC 5.2.1           2781.28   (SE +/- 7.85, N = 3)
1. (F9X) gfortran options: -fopenmp

NAS Parallel Benchmarks 3.3
Test / Class: EP.B
Total Mop/s, More Is Better
  GCC 4.9.3           49.85   (SE +/- 0.07, N = 3)
  GCC 4.8.4 - Stock   49.85   (SE +/- 0.85, N = 3)
  GCC 5.2.1           42.44   (SE +/- 0.57, N = 3)
1. (F9X) gfortran options: -fopenmp

NAS Parallel Benchmarks 3.3
Test / Class: LU.A
Total Mop/s, More Is Better
  GCC 4.8.4 - Stock   2191.56   (SE +/- 4.35, N = 3)
  GCC 4.9.3           2018.76   (SE +/- 5.25, N = 3)
  GCC 5.2.1           1196.56   (SE +/- 12.89, N = 3)
1. (F9X) gfortran options: -fopenmp

NAS Parallel Benchmarks 3.3
Test / Class: SP.A
Total Mop/s, More Is Better
  GCC 4.8.4 - Stock   1626.27   (SE +/- 3.63, N = 3)
  GCC 4.9.3           1542.82   (SE +/- 1.48, N = 3)
  GCC 5.2.1           1215.11   (SE +/- 4.54, N = 3)
1. (F9X) gfortran options: -fopenmp

Dolfyn 0.527
Computational Fluid Dynamics
Seconds, Fewer Is Better
  GCC 4.8.4 - Stock   108.79   (SE +/- 0.28, N = 3)
  GCC 4.9.3           109.68   (SE +/- 0.01, N = 3)
  GCC 5.2.1           113.06   (SE +/- 0.24, N = 3)

C-Ray 1.1
Total Time
Seconds, Fewer Is Better
  GCC 4.8.4 - Stock   93.58    (SE +/- 6.00, N = 6)
  GCC 5.2.1           96.67    (SE +/- 4.92, N = 6)
  GCC 4.9.3           101.35   (SE +/- 7.19, N = 6)
1. (CC) gcc options: -lm -lpthread -O3

Parallel BZIP2 Compression 1.1.6
256MB File Compression
Seconds, Fewer Is Better
  GCC 4.9.3           32.96   (SE +/- 0.18, N = 3)
  GCC 4.8.4 - Stock   35.38   (SE +/- 0.11, N = 3)
  GCC 5.2.1           38.09   (SE +/- 0.17, N = 3)
1. (CXX) g++ options: -O2 -pthread -lbz2 -lpthread

Smallpt 1.0
Global Illumination Renderer; 100 Samples
Seconds, Fewer Is Better
  GCC 4.8.4 - Stock   614   (SE +/- 1.33, N = 3)
  GCC 4.9.3           643   (SE +/- 1.15, N = 3)
  GCC 5.2.1           718   (SE +/- 0.88, N = 3)
1. (CXX) g++ options: -fopenmp

N-Queens 1.0
Elapsed Time
Seconds, Fewer Is Better
  GCC 4.8.4 - Stock   119.50   (SE +/- 0.06, N = 3)
  GCC 4.9.3           122.10   (SE +/- 0.09, N = 3)
  GCC 5.2.1           132.34   (SE +/- 0.31, N = 3)
1. (CC) gcc options: -static -fopenmp -O3

Phoronix Test Suite v10.8.4