GCC Jetson TX1 GCC 4.8 4.9 5.2 Benchmarking
Cortex A57 rev 1 testing: NVIDIA Jetson Tegra X1 compiler benchmarking with GCC 4.8 through GCC 5.2.1 stable. Benchmarks by Michael Larabel for a future article on Phoronix.com.
HTML result view exported from: https://openbenchmarking.org/result/1511182-HA-GCCJETSON86&sor&grs .
System Details (identical for all three configurations except Compiler)
  Processor:          Cortex A57 rev 1 @ 1.91GHz (4 Cores)
  Motherboard:        jetson_tx1
  Memory:             4096MB
  Disk:               16GB 016G32
  Graphics:           NVIDIA TEGRA
  OS:                 Ubuntu 14.04
  Kernel:             3.10.67-g3a5c467 (aarch64)
  Desktop:            Unity 7.2.2
  Display Server:     X Server 1.15.1
  Display Driver:     NVIDIA 1.0.0
  File-System:        ext4
  Screen Resolution:  3840x2160
  Compiler:
    GCC 4.8.4 - Stock:  GCC 4.8.4 + CUDA 7.0
    GCC 4.9.3:          GCC 4.9.3 + CUDA 7.0
    GCC 5.2.1:          GCC 5.2.1 20151031 + CUDA 7.0

Compiler Details
  - GCC 4.8.4 - Stock: --build=arm-linux-gnueabihf --disable-browser-plugin --disable-libitm --disable-libmudflap --disable-libquadmath --disable-sjlj-exceptions --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf --with-arch-directory=arm --with-arch=armv7-a --with-float=hard --with-fpu=vfpv3-d16 --with-mode=thumb -v
  - GCC 4.9.3: --build=arm-linux-gnueabihf --disable-browser-plugin --disable-libitm --disable-libquadmath --disable-sjlj-exceptions --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf --with-arch-directory=arm --with-arch=armv7-a --with-float=hard --with-fpu=vfpv3-d16 --with-mode=thumb -v
  - GCC 5.2.1: --build=arm-linux-gnueabihf --disable-browser-plugin --disable-libitm --disable-libquadmath --disable-libstdcxx-dual-abi --disable-sjlj-exceptions --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf --with-arch-directory=arm --with-arch=armv7-a --with-default-libstdcxx-abi=gcc4-compatible --with-float=hard --with-fpu=vfpv3-d16 --with-mode=thumb -v

Processor Details
  - Scaling Governor: tegra interactive
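
The configure lines under Compiler Details are long and differ in only a few flags. As a minimal sketch of how one might compare them, assuming the flags are whitespace-separated, the snippet below diffs two lines as sets. The helper names `flag_set` and `diff_flags` are illustrative, and the two strings are shortened excerpts of the GCC 4.8.4 and GCC 5.2.1 lines rather than the full listings.

```python
def flag_set(configure_line):
    """Split a configure line into a set of flags (duplicates collapse)."""
    return set(configure_line.split())


def diff_flags(old_line, new_line):
    """Return (added, removed) flags going from old_line to new_line."""
    old_flags, new_flags = flag_set(old_line), flag_set(new_line)
    return sorted(new_flags - old_flags), sorted(old_flags - new_flags)


# Shortened excerpts of the configure lines listed above (not the full lines).
GCC_484_EXCERPT = (
    "--disable-libitm --disable-libmudflap --disable-libquadmath "
    "--disable-sjlj-exceptions"
)
GCC_521_EXCERPT = (
    "--disable-libitm --disable-libquadmath --disable-libstdcxx-dual-abi "
    "--disable-sjlj-exceptions --with-default-libstdcxx-abi=gcc4-compatible"
)

added, removed = diff_flags(GCC_484_EXCERPT, GCC_521_EXCERPT)
print("added in 5.2.1 excerpt:  ", added)
print("removed in 5.2.1 excerpt:", removed)
```

Using sets ignores flag order and repeated flags such as the doubled --enable-multilib, which is usually what is wanted when looking for configuration differences between builds.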
Results Summary

Test                                                GCC 4.8.4 - Stock  GCC 4.9.3  GCC 5.2.1
fftw: Stock - 2D FFT Size 1024                      189.48             353.92     306.19
npb: LU.A                                           2191.56            2018.76    1196.56
john-the-ripper: MD5                                32229              30269      21894
npb: BT.A                                           3769.58            3577.54    2781.28
npb: SP.A                                           1626.27            1542.82    1215.11
npb: EP.B                                           49.85              49.85      42.44
smallpt: Global Illumination Renderer; 100 Samples  614                643        718
compress-pbzip2: 256MB File Compression             35.38              32.96      38.09
n-queens: Elapsed Time                              119.50             122.10     132.34
compress-7zip: Compress Speed Test                  4294               4171       3960
scimark2: Dense LU Matrix Factorization             700.46             680.43     654.41
scimark2: Jacobi Successive Over-Relaxation         448.16             428.29     427.92
scimark2: Sparse Matrix Multiply                    463.17             458.87     443.31
dolfyn: Computational Fluid Dynamics                108.79             109.68     113.06
scimark2: Composite                                 367.45             363.59     354.41
ffte: N=64, 1D Complex FFT Routine                  2005.31            2041.50    2040.03
scimark2: Monte Carlo                               187.50             186.75     188.55
c-ray: Total Time                                   93.58              101.35     96.67
vpxenc: vpxenc                                      9.20               7.82       7.27
scimark2: Fast Fourier Transform                    37.94              63.60      57.86
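
The summary mixes "more is better" units (Mflops, Mop/s) with "fewer is better" units (seconds), so raw numbers are not directly comparable across rows. The sketch below shows one way to express each result as a percent change relative to GCC 4.8.4, using a handful of rows copied from the summary. The function name `relative_change` and the choice to treat timed tests as a speed ratio (baseline time divided by new time) are assumptions made for illustration, not something the result export defines.

```python
# Values copied from the summary: (GCC 4.8.4 - Stock, GCC 4.9.3, GCC 5.2.1).
# The boolean records whether higher values are better for that test,
# as stated in the per-test results ("More Is Better" / "Fewer Is Better").
RESULTS = {
    "fftw: 2D FFT Size 1024 (Mflops)": ((189.48, 353.92, 306.19), True),
    "npb: LU.A (Total Mop/s)": ((2191.56, 2018.76, 1196.56), True),
    "smallpt (Seconds)": ((614.0, 643.0, 718.0), False),
    "compress-pbzip2 (Seconds)": ((35.38, 32.96, 38.09), False),
}


def relative_change(baseline, value, higher_is_better):
    """Percent performance change versus baseline; positive means faster."""
    if higher_is_better:
        return (value / baseline - 1.0) * 100.0
    # For timed tests, a shorter time is a speed-up, so invert the ratio.
    return (baseline / value - 1.0) * 100.0


for name, (values, higher_is_better) in RESULTS.items():
    base, gcc_493, gcc_521 = values
    change_493 = relative_change(base, gcc_493, higher_is_better)
    change_521 = relative_change(base, gcc_521, higher_is_better)
    print(f"{name}: GCC 4.9.3 {change_493:+.1f}%, GCC 5.2.1 {change_521:+.1f}%")
```

Normalizing the sign this way lets a reader scan one column per compiler and see regressions as negative numbers regardless of the test's unit.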
FFTW 3.3.4, Build: Stock - Size: 2D FFT Size 1024 (Mflops, More Is Better)
  GCC 4.9.3: 353.92 (SE +/- 3.00, N = 5)
  GCC 5.2.1: 306.19 (SE +/- 1.06, N = 5)
  GCC 4.8.4 - Stock: 189.48 (SE +/- 2.04, N = 5)
  Additional option shown for two of the three results: -std=gnu99
  1. (CC) gcc options: -O3 -fomit-frame-pointer -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
NAS Parallel Benchmarks 3.3, Test / Class: LU.A (Total Mop/s, More Is Better)
  GCC 4.8.4 - Stock: 2191.56 (SE +/- 4.35, N = 3)
  GCC 4.9.3: 2018.76 (SE +/- 5.25, N = 3)
  GCC 5.2.1: 1196.56 (SE +/- 12.89, N = 3)
  1. (F9X) gfortran options: -fopenmp
John The Ripper 1.8.0, Test: MD5 (Real C/S, More Is Better)
  GCC 4.8.4 - Stock: 32229 (SE +/- 0.00, N = 3)
  GCC 4.9.3: 30269 (SE +/- 101.26, N = 3)
  GCC 5.2.1: 21894 (SE +/- 114.62, N = 3)
  1. (CC) gcc options: -fopenmp
NAS Parallel Benchmarks 3.3, Test / Class: BT.A (Total Mop/s, More Is Better)
  GCC 4.8.4 - Stock: 3769.58 (SE +/- 8.11, N = 3)
  GCC 4.9.3: 3577.54 (SE +/- 4.13, N = 3)
  GCC 5.2.1: 2781.28 (SE +/- 7.85, N = 3)
  1. (F9X) gfortran options: -fopenmp
NAS Parallel Benchmarks 3.3, Test / Class: SP.A (Total Mop/s, More Is Better)
  GCC 4.8.4 - Stock: 1626.27 (SE +/- 3.63, N = 3)
  GCC 4.9.3: 1542.82 (SE +/- 1.48, N = 3)
  GCC 5.2.1: 1215.11 (SE +/- 4.54, N = 3)
  1. (F9X) gfortran options: -fopenmp
NAS Parallel Benchmarks 3.3, Test / Class: EP.B (Total Mop/s, More Is Better)
  GCC 4.9.3: 49.85 (SE +/- 0.07, N = 3)
  GCC 4.8.4 - Stock: 49.85 (SE +/- 0.85, N = 3)
  GCC 5.2.1: 42.44 (SE +/- 0.57, N = 3)
  1. (F9X) gfortran options: -fopenmp
Smallpt 1.0, Global Illumination Renderer; 100 Samples (Seconds, Fewer Is Better)
  GCC 4.8.4 - Stock: 614 (SE +/- 1.33, N = 3)
  GCC 4.9.3: 643 (SE +/- 1.15, N = 3)
  GCC 5.2.1: 718 (SE +/- 0.88, N = 3)
  1. (CXX) g++ options: -fopenmp
Parallel BZIP2 Compression 1.1.6, 256MB File Compression (Seconds, Fewer Is Better)
  GCC 4.9.3: 32.96 (SE +/- 0.18, N = 3)
  GCC 4.8.4 - Stock: 35.38 (SE +/- 0.11, N = 3)
  GCC 5.2.1: 38.09 (SE +/- 0.17, N = 3)
  1. (CXX) g++ options: -O2 -pthread -lbz2 -lpthread
N-Queens 1.0, Elapsed Time (Seconds, Fewer Is Better)
  GCC 4.8.4 - Stock: 119.50 (SE +/- 0.06, N = 3)
  GCC 4.9.3: 122.10 (SE +/- 0.09, N = 3)
  GCC 5.2.1: 132.34 (SE +/- 0.31, N = 3)
  1. (CC) gcc options: -static -fopenmp -O3
7-Zip Compression 9.20.1, Compress Speed Test (MIPS, More Is Better)
  GCC 4.8.4 - Stock: 4294 (SE +/- 19.35, N = 3)
  GCC 4.9.3: 4171 (SE +/- 5.49, N = 3)
  GCC 5.2.1: 3960 (SE +/- 8.25, N = 3)
  1. (CXX) g++ options: -pipe -lpthread
SciMark 2.0, Computational Test: Dense LU Matrix Factorization (Mflops, More Is Better)
  GCC 4.8.4 - Stock: 700.46 (SE +/- 0.61, N = 4)
  GCC 4.9.3: 680.43 (SE +/- 1.92, N = 4)
  GCC 5.2.1: 654.41 (SE +/- 2.84, N = 4)
SciMark 2.0, Computational Test: Jacobi Successive Over-Relaxation (Mflops, More Is Better)
  GCC 4.8.4 - Stock: 448.16 (SE +/- 0.07, N = 4)
  GCC 4.9.3: 428.29 (SE +/- 0.15, N = 4)
  GCC 5.2.1: 427.92 (SE +/- 0.20, N = 4)
SciMark 2.0, Computational Test: Sparse Matrix Multiply (Mflops, More Is Better)
  GCC 4.8.4 - Stock: 463.17 (SE +/- 0.51, N = 4)
  GCC 4.9.3: 458.87 (SE +/- 0.76, N = 4)
  GCC 5.2.1: 443.31 (SE +/- 1.66, N = 4)
Dolfyn 0.527, Computational Fluid Dynamics (Seconds, Fewer Is Better)
  GCC 4.8.4 - Stock: 108.79 (SE +/- 0.28, N = 3)
  GCC 4.9.3: 109.68 (SE +/- 0.01, N = 3)
  GCC 5.2.1: 113.06 (SE +/- 0.24, N = 3)
SciMark 2.0, Computational Test: Composite (Mflops, More Is Better)
  GCC 4.8.4 - Stock: 367.45 (SE +/- 0.28, N = 4)
  GCC 4.9.3: 363.59 (SE +/- 0.59, N = 4)
  GCC 5.2.1: 354.41 (SE +/- 0.87, N = 4)
FFTE 5.0, Test: N=64, 1D Complex FFT Routine (MFLOPS, More Is Better)
  GCC 4.9.3: 2041.50 (SE +/- 0.36, N = 3)
  GCC 5.2.1: 2040.03 (SE +/- 0.33, N = 3)
  GCC 4.8.4 - Stock: 2005.31 (SE +/- 35.22, N = 6)
  1. (F9X) gfortran options: -O3 -fomit-frame-pointer -fopenmp -pthread -lmpi_f90 -lmpi_f77 -lmpi -ldl -lhwloc
SciMark 2.0, Computational Test: Monte Carlo (Mflops, More Is Better)
  GCC 5.2.1: 188.55 (SE +/- 0.91, N = 4)
  GCC 4.8.4 - Stock: 187.50 (SE +/- 0.82, N = 4)
  GCC 4.9.3: 186.75 (SE +/- 0.97, N = 4)
C-Ray 1.1, Total Time (Seconds, Fewer Is Better)
  GCC 4.8.4 - Stock: 93.58 (SE +/- 6.00, N = 6)
  GCC 5.2.1: 96.67 (SE +/- 4.92, N = 6)
  GCC 4.9.3: 101.35 (SE +/- 7.19, N = 6)
  1. (CC) gcc options: -lm -lpthread -O3
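
C-Ray reports comparatively large standard errors, so the ordering of the three compilers here is less certain than in most other tests. As a rough sketch of how one might judge that, the snippet below expresses the gap between two means in units of their combined standard error. It assumes the SE values map to the configurations in the order they are listed in the export and that the runs are independent; the function name `gap_in_se_units` is illustrative and this heuristic is not part of the Phoronix Test Suite output.

```python
import math


def gap_in_se_units(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two means divided by their combined standard error."""
    combined_se = math.hypot(se_a, se_b)  # sqrt(se_a**2 + se_b**2)
    return abs(mean_a - mean_b) / combined_se


# C-Ray, seconds: GCC 4.8.4 - Stock vs GCC 4.9.3 (values from the result above).
c_ray_gap = gap_in_se_units(93.58, 6.00, 101.35, 7.19)

# FFTW, Mflops: GCC 4.9.3 vs GCC 4.8.4 - Stock, shown for contrast.
fftw_gap = gap_in_se_units(353.92, 3.00, 189.48, 2.04)

print(f"C-Ray gap: {c_ray_gap:.2f} combined SEs")
print(f"FFTW gap:  {fftw_gap:.2f} combined SEs")
```

Under these assumptions the C-Ray gap comes out below one combined standard error, while the FFTW gap is dozens of times larger than its combined standard error, which suggests the C-Ray differences should be read with caution.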
VP8 libvpx Encoding 1.3.0, vpxenc (Frames Per Second, More Is Better)
  GCC 4.8.4 - Stock: 9.20 (SE +/- 0.29, N = 6)
  GCC 4.9.3: 7.82 (SE +/- 0.33, N = 6)
  GCC 5.2.1: 7.27 (SE +/- 0.24, N = 6)
  1. (CXX) g++ options: -lvpx -lgtest -lpthread -lm -O3
SciMark 2.0, Computational Test: Fast Fourier Transform (Mflops, More Is Better)
  GCC 4.9.3: 63.60 (SE +/- 0.39, N = 4)
  GCC 5.2.1: 57.86 (SE +/- 2.03, N = 4)
  GCC 4.8.4 - Stock: 37.94 (SE +/- 0.22, N = 4)
Phoronix Test Suite v10.8.4