GCC Jetson TX1 GCC 4.8 4.9 5.2 Benchmarking

Compiler benchmarking on the NVIDIA Jetson TX1 (Tegra X1, Cortex A57 rev 1), comparing stable GCC releases from GCC 4.8 through GCC 5.2.1. Benchmarks by Michael Larabel for a future article on Phoronix.com.

HTML result view exported from: https://openbenchmarking.org/result/1511182-HA-GCCJETSON86.

Processor: Cortex A57 rev 1 @ 1.91GHz (4 Cores)
Motherboard: jetson_tx1
Memory: 4096MB
Disk: 16GB 016G32
Graphics: NVIDIA TEGRA
OS: Ubuntu 14.04
Kernel: 3.10.67-g3a5c467 (aarch64)
Desktop: Unity 7.2.2
Display Server: X Server 1.15.1
Display Driver: NVIDIA 1.0.0
File-System: ext4
Screen Resolution: 3840x2160
Compiler (GCC 4.8.4 - Stock): GCC 4.8.4 + CUDA 7.0
Compiler (GCC 4.9.3): GCC 4.9.3 + CUDA 7.0
Compiler (GCC 5.2.1): GCC 5.2.1 20151031 + CUDA 7.0

Compiler Details

GCC 4.8.4 - Stock: --build=arm-linux-gnueabihf --disable-browser-plugin --disable-libitm --disable-libmudflap --disable-libquadmath --disable-sjlj-exceptions --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf --with-arch-directory=arm --with-arch=armv7-a --with-float=hard --with-fpu=vfpv3-d16 --with-mode=thumb -v

GCC 4.9.3: --build=arm-linux-gnueabihf --disable-browser-plugin --disable-libitm --disable-libquadmath --disable-sjlj-exceptions --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf --with-arch-directory=arm --with-arch=armv7-a --with-float=hard --with-fpu=vfpv3-d16 --with-mode=thumb -v

GCC 5.2.1: --build=arm-linux-gnueabihf --disable-browser-plugin --disable-libitm --disable-libquadmath --disable-libstdcxx-dual-abi --disable-sjlj-exceptions --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf --with-arch-directory=arm --with-arch=armv7-a --with-default-libstdcxx-abi=gcc4-compatible --with-float=hard --with-fpu=vfpv3-d16 --with-mode=thumb -v

Processor Details

Scaling Governor: tegra interactive

| Test | GCC 4.8.4 - Stock | GCC 4.9.3 | GCC 5.2.1 |
|---|---|---|---|
| npb: BT.A | 3769.58 | 3577.54 | 2781.28 |
| npb: EP.B | 49.85 | 49.85 | 42.44 |
| npb: LU.A | 2191.56 | 2018.76 | 1196.56 |
| npb: SP.A | 1626.27 | 1542.82 | 1215.11 |
| dolfyn: Computational Fluid Dynamics | 108.79 | 109.68 | 113.06 |
| ffte: N=64, 1D Complex FFT Routine | 2005.31 | 2041.50 | 2040.03 |
| fftw: Stock - 2D FFT Size 1024 | 189.48 | 353.92 | 306.19 |
| scimark2: Composite | 367.45 | 363.59 | 354.41 |
| scimark2: Monte Carlo | 187.50 | 186.75 | 188.55 |
| scimark2: Fast Fourier Transform | 37.94 | 63.60 | 57.86 |
| scimark2: Sparse Matrix Multiply | 463.17 | 458.87 | 443.31 |
| scimark2: Dense LU Matrix Factorization | 700.46 | 680.43 | 654.41 |
| scimark2: Jacobi Successive Over-Relaxation | 448.16 | 428.29 | 427.92 |
| john-the-ripper: MD5 | 32229 | 30269 | 21894 |
| vpxenc: vpxenc | 9.20 | 7.82 | 7.27 |
| compress-7zip: Compress Speed Test | 4294 | 4171 | 3960 |
| c-ray: Total Time | 93.58 | 101.35 | 96.67 |
| compress-pbzip2: 256MB File Compression | 35.38 | 32.96 | 38.09 |
| smallpt: Global Illumination Renderer; 100 Samples | 614 | 643 | 718 |
| n-queens: Elapsed Time | 119.50 | 122.10 | 132.34 |
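The overview mixes "more is better" and "fewer is better" units, so raw numbers cannot be compared directly across rows. The following is a minimal sketch, not part of the original export, of how the relative change between two compilers could be computed with the direction taken into account. The helper name `speedup_percent` and the choice of four sample rows are illustrative assumptions; the values are copied from the results in this document.

```python
# Hypothetical helper for illustration; values are copied from the results above.
RESULTS = {
    # test: (GCC 4.8.4 - Stock, GCC 5.2.1, higher_is_better)
    "npb: LU.A (Total Mop/s)": (2191.56, 1196.56, True),
    "fftw: 2D FFT Size 1024 (Mflops)": (189.48, 306.19, True),
    "smallpt (Seconds)": (614.0, 718.0, False),
    "n-queens (Seconds)": (119.50, 132.34, False),
}


def speedup_percent(baseline, candidate, higher_is_better):
    """Percent performance change of candidate vs. baseline (positive = faster)."""
    if higher_is_better:
        ratio = candidate / baseline
    else:
        # For timings, a smaller candidate value means better performance.
        ratio = baseline / candidate
    return (ratio - 1.0) * 100.0


for name, (old, new, higher_is_better) in RESULTS.items():
    change = speedup_percent(old, new, higher_is_better)
    print(f"{name}: GCC 5.2.1 vs GCC 4.8.4 = {change:+.1f}%")
```

Inverting the ratio for time-based tests keeps the sign convention consistent: a positive percentage always means the newer compiler produced the faster binary.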

NAS Parallel Benchmarks

Test / Class: BT.A

NAS Parallel Benchmarks 3.3, Test / Class: BT.A. Total Mop/s, more is better.
GCC 4.8.4 - Stock: 3769.58 (SE +/- 8.11, N = 3)
GCC 4.9.3: 3577.54 (SE +/- 4.13, N = 3)
GCC 5.2.1: 2781.28 (SE +/- 7.85, N = 3)
(F9X) gfortran options: -fopenmp

NAS Parallel Benchmarks

Test / Class: EP.B

NAS Parallel Benchmarks 3.3, Test / Class: EP.B. Total Mop/s, more is better.
GCC 4.8.4 - Stock: 49.85 (SE +/- 0.85, N = 3)
GCC 4.9.3: 49.85 (SE +/- 0.07, N = 3)
GCC 5.2.1: 42.44 (SE +/- 0.57, N = 3)
(F9X) gfortran options: -fopenmp

NAS Parallel Benchmarks

Test / Class: LU.A

NAS Parallel Benchmarks 3.3, Test / Class: LU.A. Total Mop/s, more is better.
GCC 4.8.4 - Stock: 2191.56 (SE +/- 4.35, N = 3)
GCC 4.9.3: 2018.76 (SE +/- 5.25, N = 3)
GCC 5.2.1: 1196.56 (SE +/- 12.89, N = 3)
(F9X) gfortran options: -fopenmp

NAS Parallel Benchmarks

Test / Class: SP.A

NAS Parallel Benchmarks 3.3, Test / Class: SP.A. Total Mop/s, more is better.
GCC 4.8.4 - Stock: 1626.27 (SE +/- 3.63, N = 3)
GCC 4.9.3: 1542.82 (SE +/- 1.48, N = 3)
GCC 5.2.1: 1215.11 (SE +/- 4.54, N = 3)
(F9X) gfortran options: -fopenmp

Dolfyn

Computational Fluid Dynamics

Dolfyn 0.527, Computational Fluid Dynamics. Seconds, fewer is better.
GCC 4.8.4 - Stock: 108.79 (SE +/- 0.28, N = 3)
GCC 4.9.3: 109.68 (SE +/- 0.01, N = 3)
GCC 5.2.1: 113.06 (SE +/- 0.24, N = 3)

FFTE

Test: N=64, 1D Complex FFT Routine

FFTE 5.0, Test: N=64, 1D Complex FFT Routine. MFLOPS, more is better.
GCC 4.8.4 - Stock: 2005.31 (SE +/- 35.22, N = 6)
GCC 4.9.3: 2041.50 (SE +/- 0.36, N = 3)
GCC 5.2.1: 2040.03 (SE +/- 0.33, N = 3)
(F9X) gfortran options: -O3 -fomit-frame-pointer -fopenmp -pthread -lmpi_f90 -lmpi_f77 -lmpi -ldl -lhwloc

FFTW

Build: Stock - Size: 2D FFT Size 1024

FFTW 3.3.4, Build: Stock - Size: 2D FFT Size 1024. Mflops, more is better.
GCC 4.8.4 - Stock: 189.48 (SE +/- 2.04, N = 5)
GCC 4.9.3: 353.92 (SE +/- 3.00, N = 5)
GCC 5.2.1: 306.19 (SE +/- 1.06, N = 5)
(CC) gcc options: -O3 -fomit-frame-pointer -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
Two of the three results additionally list -std=gnu99; the export does not indicate which.

SciMark

Computational Test: Composite

SciMark 2.0, Computational Test: Composite. Mflops, more is better.
GCC 4.8.4 - Stock: 367.45 (SE +/- 0.28, N = 4)
GCC 4.9.3: 363.59 (SE +/- 0.59, N = 4)
GCC 5.2.1: 354.41 (SE +/- 0.87, N = 4)

SciMark

Computational Test: Monte Carlo

SciMark 2.0, Computational Test: Monte Carlo. Mflops, more is better.
GCC 4.8.4 - Stock: 187.50 (SE +/- 0.82, N = 4)
GCC 4.9.3: 186.75 (SE +/- 0.97, N = 4)
GCC 5.2.1: 188.55 (SE +/- 0.91, N = 4)

SciMark

Computational Test: Fast Fourier Transform

SciMark 2.0, Computational Test: Fast Fourier Transform. Mflops, more is better.
GCC 4.8.4 - Stock: 37.94 (SE +/- 0.22, N = 4)
GCC 4.9.3: 63.60 (SE +/- 0.39, N = 4)
GCC 5.2.1: 57.86 (SE +/- 2.03, N = 4)

SciMark

Computational Test: Sparse Matrix Multiply

SciMark 2.0, Computational Test: Sparse Matrix Multiply. Mflops, more is better.
GCC 4.8.4 - Stock: 463.17 (SE +/- 0.51, N = 4)
GCC 4.9.3: 458.87 (SE +/- 0.76, N = 4)
GCC 5.2.1: 443.31 (SE +/- 1.66, N = 4)

SciMark

Computational Test: Dense LU Matrix Factorization

SciMark 2.0, Computational Test: Dense LU Matrix Factorization. Mflops, more is better.
GCC 4.8.4 - Stock: 700.46 (SE +/- 0.61, N = 4)
GCC 4.9.3: 680.43 (SE +/- 1.92, N = 4)
GCC 5.2.1: 654.41 (SE +/- 2.84, N = 4)

SciMark

Computational Test: Jacobi Successive Over-Relaxation

SciMark 2.0, Computational Test: Jacobi Successive Over-Relaxation. Mflops, more is better.
GCC 4.8.4 - Stock: 448.16 (SE +/- 0.07, N = 4)
GCC 4.9.3: 428.29 (SE +/- 0.15, N = 4)
GCC 5.2.1: 427.92 (SE +/- 0.20, N = 4)
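The SciMark Composite figures reported earlier are consistent with being the arithmetic mean of the five kernel scores (Monte Carlo, Fast Fourier Transform, Sparse Matrix Multiply, Dense LU Matrix Factorization, Jacobi Successive Over-Relaxation). The sketch below is an illustrative cross-check, not part of the original export; the helper name `composite` is hypothetical and the values are copied from the SciMark results in this document.

```python
# Kernel scores in Mflops, in the order: Monte Carlo, FFT, Sparse Matrix
# Multiply, Dense LU, Jacobi SOR. Values are copied from the results above.
KERNELS = {
    "GCC 4.8.4 - Stock": [187.50, 37.94, 463.17, 700.46, 448.16],
    "GCC 4.9.3": [186.75, 63.60, 458.87, 680.43, 428.29],
    "GCC 5.2.1": [188.55, 57.86, 443.31, 654.41, 427.92],
}

# Composite scores as reported in the export.
REPORTED = {
    "GCC 4.8.4 - Stock": 367.45,
    "GCC 4.9.3": 363.59,
    "GCC 5.2.1": 354.41,
}


def composite(scores):
    """Arithmetic mean of the kernel scores (hypothetical helper)."""
    return sum(scores) / len(scores)


for name, scores in KERNELS.items():
    print(f"{name}: computed {composite(scores):.2f}, reported {REPORTED[name]:.2f}")
```

Because the composite is a plain average, the large Fast Fourier Transform gain from GCC 4.8.4 to GCC 4.9.3 is offset by small losses in the higher-scoring kernels, which is why the composite barely moves.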

John The Ripper

Test: MD5

John The Ripper 1.8.0, Test: MD5. Real C/S, more is better.
GCC 4.8.4 - Stock: 32229 (SE +/- 0.00, N = 3)
GCC 4.9.3: 30269 (SE +/- 101.26, N = 3)
GCC 5.2.1: 21894 (SE +/- 114.62, N = 3)
(CC) gcc options: -fopenmp

VP8 libvpx Encoding

vpxenc

VP8 libvpx Encoding 1.3.0, vpxenc. Frames Per Second, more is better.
GCC 4.8.4 - Stock: 9.20 (SE +/- 0.29, N = 6)
GCC 4.9.3: 7.82 (SE +/- 0.33, N = 6)
GCC 5.2.1: 7.27 (SE +/- 0.24, N = 6)
(CXX) g++ options: -lvpx -lgtest -lpthread -lm -O3

7-Zip Compression

Compress Speed Test

7-Zip Compression 9.20.1, Compress Speed Test. MIPS, more is better.
GCC 4.8.4 - Stock: 4294 (SE +/- 19.35, N = 3)
GCC 4.9.3: 4171 (SE +/- 5.49, N = 3)
GCC 5.2.1: 3960 (SE +/- 8.25, N = 3)
(CXX) g++ options: -pipe -lpthread

C-Ray

Total Time

C-Ray 1.1, Total Time. Seconds, fewer is better.
GCC 4.8.4 - Stock: 93.58 (SE +/- 6.00, N = 6)
GCC 4.9.3: 101.35 (SE +/- 7.19, N = 6)
GCC 5.2.1: 96.67 (SE +/- 4.92, N = 6)
(CC) gcc options: -lm -lpthread -O3
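The standard errors reported for C-Ray are large relative to the gaps between compilers, so it is worth checking whether a difference stands out from run-to-run noise. The sketch below is a rough heuristic and not a formal significance test: it assumes independent runs, combines the two standard errors in quadrature, and uses a threshold of 2 combined SEs, which is an assumed rule of thumb. The function name `separation` is hypothetical; the values are copied from the C-Ray and NAS Parallel Benchmarks LU.A results in this document.

```python
import math


def separation(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two means in units of their combined standard error."""
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


# (mean, SE) pairs copied from the results in this document.
cray = separation(93.58, 6.00, 101.35, 7.19)      # C-Ray: GCC 4.8.4 vs GCC 4.9.3
lu_a = separation(2191.56, 4.35, 1196.56, 12.89)  # NPB LU.A: GCC 4.8.4 vs GCC 5.2.1

THRESHOLD = 2.0  # assumed rule of thumb, not taken from the source

for label, z in [("C-Ray, 4.8.4 vs 4.9.3", cray), ("NPB LU.A, 4.8.4 vs 5.2.1", lu_a)]:
    verdict = "clearly separated" if z > THRESHOLD else "within run-to-run noise"
    print(f"{label}: {z:.1f} combined SEs apart ({verdict})")
```

Under these assumptions the C-Ray gap between GCC 4.8.4 and GCC 4.9.3 falls well inside the noise, while the LU.A drop with GCC 5.2.1 is far outside it.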

Parallel BZIP2 Compression

256MB File Compression

Parallel BZIP2 Compression 1.1.6, 256MB File Compression. Seconds, fewer is better.
GCC 4.8.4 - Stock: 35.38 (SE +/- 0.11, N = 3)
GCC 4.9.3: 32.96 (SE +/- 0.18, N = 3)
GCC 5.2.1: 38.09 (SE +/- 0.17, N = 3)
(CXX) g++ options: -O2 -pthread -lbz2 -lpthread

Smallpt

Global Illumination Renderer; 100 Samples

Smallpt 1.0, Global Illumination Renderer; 100 Samples. Seconds, fewer is better.
GCC 4.8.4 - Stock: 614 (SE +/- 1.33, N = 3)
GCC 4.9.3: 643 (SE +/- 1.15, N = 3)
GCC 5.2.1: 718 (SE +/- 0.88, N = 3)
(CXX) g++ options: -fopenmp

N-Queens

Elapsed Time

N-Queens 1.0, Elapsed Time. Seconds, fewer is better.
GCC 4.8.4 - Stock: 119.50 (SE +/- 0.06, N = 3)
GCC 4.9.3: 122.10 (SE +/- 0.09, N = 3)
GCC 5.2.1: 132.34 (SE +/- 0.31, N = 3)
(CC) gcc options: -static -fopenmp -O3


Phoronix Test Suite v10.8.4