TX2 ARMv8 rev 3 testing on Ubuntu 16.04 via the Phoronix Test Suite.
TX26CORE Processor Test
Processor: ARMv8 rev 3 @ 2.04GHz (6 Cores), Memory: 8192MB, Disk: 31GB 032G34, Graphics: NVIDIA Tegra X2 (nvgpu)
OS: Ubuntu 16.04, Kernel: 4.4.38-tegra (aarch64), Desktop: Unity 7.4.0, Display Server: X Server 1.18.4, Display Driver: NVIDIA 28.1.0, OpenGL: 4.5.0, Compiler: GCC 5.4.0 20160609 + CUDA 8.0, File-System: ext4, Screen Resolution: 1920x1080
Compiler Notes: --build=aarch64-linux-gnu --disable-browser-plugin --disable-libquadmath --disable-werror --enable-checking=release --enable-clocale=gnu --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --target=aarch64-linux-gnu --with-arch-directory=aarch64 --with-default-libstdcxx-abi=new -v
Processor Notes: Scaling Governor: tegra_cpufreq schedutil
C-Ray
This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image. Learn more via the OpenBenchmarking.org test page.
C-Ray 1.1 Total Time (Seconds, Fewer Is Better), TX26CORE Processor Test: 48.53, SE +/- 2.30, N = 6
1. (CC) gcc options: -lm -lpthread -O3
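The SE +/- value reported with each result is the standard error over the N runs. To compare run-to-run noise across tests that use different units, it can help to express it as a percentage of the mean. The following is a minimal Python sketch using the C-Ray figures; the function name relative_standard_error is an illustrative assumption and not part of the Phoronix Test Suite.

```python
def relative_standard_error(mean: float, standard_error: float) -> float:
    """Return the standard error as a percentage of the mean."""
    if mean == 0:
        raise ValueError("mean must be non-zero")
    return 100.0 * standard_error / abs(mean)


# Values taken from the C-Ray 1.1 result: 48.53 seconds, SE +/- 2.30.
c_ray_rse = relative_standard_error(48.53, 2.30)
print(f"C-Ray relative standard error: {c_ray_rse:.2f}%")
```

A result with a relative standard error of a few percent, as here, varied noticeably between runs, which may be why this test was run 6 times rather than 3.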
CacheBench
CacheBench Test: Read / Modify / Write (MB/s, More Is Better), TX26CORE Processor Test: 16383.82, SE +/- 105.69, N = 3
1. (CC) gcc options: -lrt
CLOMP
CLOMP is the C version of the Livermore OpenMP benchmark developed to measure OpenMP overheads and other performance impacts due to threading in order to influence future system designs. This particular test profile configuration is currently set to look at the OpenMP static schedule speed-up across all available CPU cores using the recommended test configuration. Learn more via the OpenBenchmarking.org test page.
CLOMP 3.3 Static OMP Speedup (Speedup, More Is Better), TX26CORE Processor Test: 2.63, SE +/- 0.10, N = 10
1. (CC) gcc options: --openmp -O3 -lm
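One rough way to read the CLOMP speedup is as parallel efficiency: the speedup divided by the number of cores. The sketch below assumes the reported speedup is relative to a serial run and that all 6 cores listed in the system details took part; both are assumptions for illustration, and parallel_efficiency is a hypothetical helper name.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Return speedup per core (1.0 would mean ideal linear scaling)."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return speedup / cores


# Reported CLOMP Static OMP Speedup of 2.63 on the 6-core TX2 (assumed core count in use).
efficiency = parallel_efficiency(2.63, 6)
print(f"Parallel efficiency: {efficiency:.1%}")
```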
Crafty
Crafty 23.4 Elapsed Time (Seconds, Fewer Is Better), TX26CORE Processor Test: 227.56, SE +/- 0.71, N = 3
1. (CC) gcc options: -lstdc++ -lm
FFTE
FFTE is a package by Daisuke Takahashi to compute Discrete Fourier Transforms of 1-, 2- and 3- dimensional sequences of length (2^p)*(3^q)*(5^r). Learn more via the OpenBenchmarking.org test page.
FFTE 5.0 Test: N=64, 1D Complex FFT Routine (MFLOPS, More Is Better), TX26CORE Processor Test: 4324.03, SE +/- 48.13, N = 3
1. (F9X) gfortran options: -O3 -fomit-frame-pointer -fopenmp -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
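The length constraint (2^p)*(3^q)*(5^r) means a sequence length is usable only if it has no prime factors other than 2, 3 and 5. As a minimal sketch of that check (the function name factor_235 is hypothetical and not part of FFTE), the following Python divides out those three primes and reports the exponents:

```python
from typing import Optional, Tuple


def factor_235(n: int) -> Optional[Tuple[int, int, int]]:
    """Return (p, q, r) with n == 2**p * 3**q * 5**r, or None if no such form exists."""
    if n < 1:
        return None
    exponents = []
    for prime in (2, 3, 5):
        count = 0
        while n % prime == 0:
            n //= prime
            count += 1
        exponents.append(count)
    # Anything left over is a prime factor other than 2, 3 or 5.
    return tuple(exponents) if n == 1 else None


# N=64 is the length used by this test profile; 14 contains the factor 7.
print(factor_235(64))
print(factor_235(14))
```

For the N=64 case used here, the length is a pure power of two (p = 6, q = r = 0).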
Fhourstones
Fhourstones 3.1 Complex Connect-4 Solving (Kpos / sec, More Is Better), TX26CORE Processor Test: 5091.33, SE +/- 8.67, N = 3
1. (CC) gcc options: -O3
GraphicsMagick
This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests to stress the system's CPU. Learn more via the OpenBenchmarking.org test page.
GraphicsMagick 1.3.19 (Iterations Per Minute, More Is Better), TX26CORE Processor Test:
Operation: Blur: 58, SE +/- 0.33, N = 3
Operation: Sharpen: 70, SE +/- 0.33, N = 3
Operation: Resizing: 95, SE +/- 0.00, N = 3
Operation: HWB Color Space: 101, SE +/- 0.58, N = 3
Operation: Local Adaptive Thresholding: 85, SE +/- 0.33, N = 3
1. (CC) gcc options: -fopenmp -O2 -pthread -ljbig -lwebp -ltiff -ljasper -ljpeg -lpng12 -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lgomp -lpthread
HPC Challenge
HPC Challenge 1.5.0, TX26CORE Processor Test:
Test / Class: G-HPL (GFLOPS, More Is Better): 1.93829, SE +/- 0.00221, N = 3
Test / Class: G-Ffte (GFLOPS, More Is Better): 1.76061, SE +/- 0.04459, N = 3
Test / Class: EP-DGEMM (GFLOPS, More Is Better): 0.69042, SE +/- 0.02446, N = 3
Test / Class: G-Ptrans (GB/s, More Is Better): 1.21616, SE +/- 0.00267, N = 3
Test / Class: EP-STREAM Triad (GB/s, More Is Better): 2.00777, SE +/- 0.00435, N = 3
Test / Class: G-Random Access (GUP/s, More Is Better): 0.01541, SE +/- 0.00015, N = 3
Test / Class: Random Ring Latency (usecs, Fewer Is Better): 0.93725, SE +/- 0.18070, N = 3
Test / Class: Random Ring Bandwidth (GB/s, More Is Better): 0.94654, SE +/- 0.01072, N = 3
Test / Class: Max Ping Pong Bandwidth (MB/s, More Is Better): 7555.49, SE +/- 77.80, N = 3
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -march=native -funroll-loops
2. BLAS + Open MPI 1.10.2
lzbench
lzbench 2017-08-08 (MB/s, More Is Better), TX26CORE Processor Test:
Test: Zstd 1: 148, SE +/- 0.33, N = 3
Test: Brotli 0: 159, SE +/- 0.33, N = 3
Test: Libdeflate 1: 77, SE +/- 0.33, N = 3
1. (CXX) g++ options: -lrt -static -lpthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
Minion
Minion is an open-source constraint solver that is designed to be very scalable. This test profile uses Minion's integrated benchmarking problems to solve. Learn more via the OpenBenchmarking.org test page.
Minion 1.8 (Seconds, Fewer Is Better), TX26CORE Processor Test:
Benchmark: Graceful: 122.22, SE +/- 0.83, N = 3
Benchmark: Solitaire: 317.51, SE +/- 3.24, N = 3
Benchmark: Quasigroup: 288.71, SE +/- 0.51, N = 3
1. (CXX) g++ options: -std=gnu++11 -O3 -fomit-frame-pointer -rdynamic
Multichase Pointer Chaser
Multichase Pointer Chaser Test: 4MB Array, 64 Byte Stride (ns, Fewer Is Better), TX26CORE Processor Test: 142.40, SE +/- 0.21, N = 3
1. (CC) gcc options: -O2 -static -pthread -lrt
NAS Parallel Benchmarks
NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB. Learn more via the OpenBenchmarking.org test page.
NAS Parallel Benchmarks 3.3 (Total Mop/s, More Is Better), TX26CORE Processor Test:
Test / Class: BT.A: 1321.92, SE +/- 14.98, N = 3
Test / Class: EP.C: 103.86, SE +/- 1.01, N = 3
Test / Class: FT.A: 2143.61, SE +/- 34.64, N = 4
Test / Class: FT.B: 2153.52, SE +/- 14.27, N = 3
Test / Class: LU.A: 3812.03, SE +/- 43.46, N = 3
Test / Class: LU.C: 2935.56, SE +/- 6.70, N = 3
Test / Class: SP.A: 494.15, SE +/- 4.80, N = 3
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 1.10.2
Parboil
The Parboil Benchmarks from the IMPACT Research Group at University of Illinois are a set of throughput computing applications for looking at computing architecture and compilers. Parboil test-cases support OpenMP, OpenCL, and CUDA multi-processing environments. However, at this time the test profile is just making use of the OpenMP and OpenCL test workloads. Learn more via the OpenBenchmarking.org test page.
Parboil 2.5 (Seconds, Fewer Is Better), TX26CORE Processor Test:
Test: OpenMP LBM: 543.87, SE +/- 0.79, N = 3
Test: OpenMP CUTCP: 79.51, SE +/- 0.50, N = 3
Test: OpenMP Stencil: 67.83, SE +/- 0.06, N = 3
Test: OpenMP MRI Gridding: 263.55, SE +/- 6.94, N = 6
1. (CXX) g++ options: -lm -lpthread -lgomp -ffast-math -fopenmp
Perl Benchmarks
Perl Benchmarks Test: Pod2html (seconds, Fewer Is Better), TX26CORE Processor Test: 0.43082992, SE +/- 0.00056902, N = 3
PolyBench-C
PolyBench-C 3.2 (Seconds, Fewer Is Better), TX26CORE Processor Test:
Test: Covariance Computation: 6.05, SE +/- 0.01, N = 3
Test: Correlation Computation: 6.05, SE +/- 0.01, N = 3
Test: 3 Matrix Multiplications: 89.43, SE +/- 0.07, N = 3
1. (CC) gcc options: -O3 -march=native
Rodinia
Rodinia is a suite focused upon accelerating compute-intensive applications with accelerators. CUDA, OpenMP, and OpenCL parallel models are supported by the included applications. This profile utilizes the OpenCL and OpenMP test binaries at the moment. Learn more via the OpenBenchmarking.org test page.
Rodinia 2.4 (Seconds, Fewer Is Better), TX26CORE Processor Test:
Test: OpenMP LavaMD: 526.43, SE +/- 2.96, N = 3
Test: OpenMP CFD Solver: 212.13, SE +/- 0.71, N = 3
Test: OpenMP Streamcluster: 84.59, SE +/- 0.18, N = 3
1. (CXX) g++ options: -O2 -lOpenCL
SciMark
SciMark 2.0 (Mflops, More Is Better), TX26CORE Processor Test:
Computational Test: Composite: 281.07, SE +/- 4.56, N = 8
Computational Test: Monte Carlo: 118.56, SE +/- 0.03, N = 4
Computational Test: Fast Fourier Transform: 66.72, SE +/- 0.16, N = 4
Computational Test: Sparse Matrix Multiply: 424.37, SE +/- 2.34, N = 4
Computational Test: Dense LU Matrix Factorization: 427.10, SE +/- 46.23, N = 4
Computational Test: Jacobi Successive Over-Relaxation: 352.42, SE +/- 8.54, N = 4
1. (CC) gcc options: -lm
Smallpt
Smallpt 1.0 Global Illumination Renderer; 100 Samples (Seconds, Fewer Is Better), TX26CORE Processor Test: 202, SE +/- 3.06, N = 5
1. (CXX) g++ options: -fopenmp
Stockfish
Stockfish 2014-11-26 Total Time (ms, Fewer Is Better), TX26CORE Processor Test: 10284, SE +/- 77.49, N = 3
1. (CXX) g++ options: -lpthread -fno-exceptions -fno-rtti -ansi -pedantic -O3 -flto
Testing initiated at 14 December 2017 12:41 by user ubuntu.