NVIDIA Jetson Nano Benchmarks

ARMv8 rev 1 testing with a jetson-nano and NVIDIA Tegra X1 on Ubuntu 18.04 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 1903316-HV-NVIDIAJET61
Result Identifier: Jetson Nano
Date Run: March 30 2019
Test Duration: 16 Hours, 5 Minutes


System configuration (Jetson Nano):

Processor: ARMv8 rev 1 @ 1.43GHz (4 Cores)
Motherboard: jetson-nano
Memory: 4096MB
Disk: 32GB GB1QT
Graphics: NVIDIA Tegra X1
Monitor: VE228
Network: Realtek RTL8111/8168/8411
OS: Ubuntu 18.04
Kernel: 4.9.140-tegra (aarch64)
Desktop: Unity 7.5.0
Display Server: X Server 1.19.6
Display Driver: NVIDIA 32.1.0
OpenGL: 4.6.0
Vulkan: 1.1.85
Compiler: GCC 7.3.0 + CUDA 10.0
File-System: ext4
Screen Resolution: 1920x1080

System Logs
- --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-default-libstdcxx-abi=new --with-gcc-major-version-only -v
- Scaling Governor: tegra-cpufreq schedutil
- Python 2.7.15rc1 + Python 3.6.7

Result overview (Jetson Nano):

Test | Jetson Nano
cuda-mini-nbody: SOA Data Layout | 3.66
cuda-mini-nbody: Flush Denormals To Zero | 3.66
cuda-mini-nbody: Loop Unrolling | 8.93
tensorrt-inference: GoogleNet - INT8 - 1 - Disabled | 35.87
compress-xz: Compressing ubuntu-16.04.3-server-i386.img, Compression Level 9 | 44.43
mbw: Memory Copy - 128 MiB | 3420.37
tensorrt-inference: ResNet50 - INT8 - 8 - Disabled | 22.16
mbw: Memory Copy - 512 MiB | 3438.75
tensorrt-inference: AlexNet - INT8 - 32 - Disabled | 128.55
tensorrt-inference: GoogleNet - INT8 - 4 - Disabled | 47.83
tensorrt-inference: AlexNet - INT8 - 16 - Disabled | 113.76
tensorrt-inference: VGG16 - FP16 - 1 - Disabled | 10.29
tensorrt-inference: AlexNet - FP16 - 16 - Disabled | 168.83
tensorrt-inference: ResNet50 - FP16 - 8 - Disabled | 42.05
tensorrt-inference: ResNet50 - INT8 - 4 - Disabled | 20.59
tensorrt-inference: AlexNet - FP16 - 4 - Disabled | 115.49
tensorrt-inference: AlexNet - FP16 - 8 - Disabled | 133.74
tensorrt-inference: VGG19 - FP16 - 4 - Disabled | 11.61
tensorrt-inference: AlexNet - FP16 - 1 - Disabled | 54.86
tensorrt-inference: VGG19 - FP16 - 1 - Disabled | 8.70
tensorrt-inference: VGG16 - FP16 - 8 - Disabled | 14.60
tensorrt-inference: VGG16 - FP16 - 4 - Disabled | 14.18
tensorrt-inference: ResNet50 - FP16 - 1 - Disabled | 27.37
tensorrt-inference: AlexNet - INT8 - 1 - Disabled | 40.48
tensorrt-inference: ResNet50 - FP16 - 4 - Disabled | 40.65
tensorrt-inference: AlexNet - INT8 - 4 - Disabled | 82.30
tensorrt-inference: ResNet50 - INT8 - 1 - Disabled | 14.61
tensorrt-inference: AlexNet - INT8 - 8 - Disabled | 92.54
mbw: Memory Copy, Fixed Block Size - 128 MiB | 3450.26
tensorrt-inference: AlexNet - FP16 - 32 - Disabled | 202.29
mbw: Memory Copy, Fixed Block Size - 512 MiB | 3448.76
tensorrt-inference: GoogleNet - FP16 - 8 - Disabled | 85.92
t-test1: 1 | 80.31
t-test1: 2 | 27.35
ramspeed: Add - Integer | 7943.83
tensorrt-inference: GoogleNet - FP16 - 1 - Disabled | 65.61
tensorrt-inference: ResNet152 - FP16 - 16 - Disabled | 16.98
tensorrt-inference: ResNet152 - FP16 - 32 - Disabled | 17.28
tensorrt-inference: GoogleNet - INT8 - 16 - Disabled | 52.19
tensorrt-inference: GoogleNet - INT8 - 32 - Disabled | 55.47
tensorrt-inference: GoogleNet - FP16 - 16 - Disabled | 93.33
tensorrt-inference: GoogleNet - FP16 - 32 - Disabled | 98.75
tensorrt-inference: ResNet50 - INT8 - 16 - Disabled | 23.82
tensorrt-inference: ResNet50 - INT8 - 32 - Disabled | 25.01
tensorrt-inference: ResNet50 - FP16 - 16 - Disabled | 44.49
tensorrt-inference: ResNet50 - FP16 - 32 - Disabled | 46.26
tensorrt-inference: ResNet152 - INT8 - 1 - Disabled | 5.45
tensorrt-inference: ResNet152 - FP16 - 8 - Disabled | 16.42
tensorrt-inference: ResNet152 - FP16 - 4 - Disabled | 15.78
ramspeed: Copy - Integer | 9544.18
ramspeed: Triad - Integer | 4856.02
tensorrt-inference: GoogleNet - FP16 - 4 - Disabled | 85.12
ramspeed: Scale - Integer | 9141.59
ramspeed: Average - Integer | 7839.77
cuda-mini-nbody: Cache Blocking | 8.47
glmark2: 1024 x 768 | 1362
glmark2: 1280 x 1024 | 904
glmark2: 1920 x 1080 | 646
lczero: BLAS | 15.34
tensorrt-inference: ResNet152 - FP16 - 1 - Disabled | 10.09
lczero: CUDA + cuDNN | 139
j2dbench: Text Rendering | 6226.12
j2dbench: Image Rendering | 897658.51
j2dbench: Vector Graphics Rendering | 486283.59
compress-7zip: Compress Speed Test | 4050
tensorrt-inference: GoogleNet - INT8 - 8 - Disabled | 49.18
glmark2: 800 x 600 | 1915
compress-zstd: Compressing ubuntu-16.04.3-server-i386.img, Compression Level 19 | 127.28
cuda-mini-nbody: Original | 4.09
build-linux-kernel: Time To Compile | 2378.69
x264: H.264 Video Encoding | 5.12

CUDA Mini-Nbody

The CUDA version of Harrism's mini-nbody tests. Learn more via the OpenBenchmarking.org test page.

CUDA Mini-Nbody 2015-11-10 - Test: SOA Data Layout
(NBody^2)/s, More Is Better
Jetson Nano: 3.66 (SE +/- 0.00, N = 3)

CUDA Mini-Nbody 2015-11-10 - Test: Flush Denormals To Zero
(NBody^2)/s, More Is Better
Jetson Nano: 3.66 (SE +/- 0.00, N = 3)

CUDA Mini-Nbody 2015-11-10 - Test: Loop Unrolling
(NBody^2)/s, More Is Better
Jetson Nano: 8.93 (SE +/- 0.03, N = 3)
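To make the "(NBody^2)/s" unit concrete, here is a minimal CPU sketch of an all-pairs n-body step that counts pairwise interactions per second. It is only an illustration in pure Python, not the CUDA kernels the test profile runs; the function names, body count, and softening constant are assumptions, and the result page does not state the scale factor applied to the reported figures.

```python
import math
import random
import time


def body_force(pos, vel, dt, softening=1e-9):
    """Naive all-pairs update: each body accumulates a force from every body.

    Returns the number of pairwise interactions evaluated (n * n).
    """
    n = len(pos)
    for i in range(n):
        fx = fy = fz = 0.0
        xi, yi, zi = pos[i]
        for j in range(n):
            dx = pos[j][0] - xi
            dy = pos[j][1] - yi
            dz = pos[j][2] - zi
            # Softening keeps the i == j term finite; its contribution is zero.
            dist_sqr = dx * dx + dy * dy + dz * dz + softening
            inv_dist3 = 1.0 / (dist_sqr * math.sqrt(dist_sqr))
            fx += dx * inv_dist3
            fy += dy * inv_dist3
            fz += dz * inv_dist3
        vel[i][0] += dt * fx
        vel[i][1] += dt * fy
        vel[i][2] += dt * fz
    return n * n


def interactions_per_second(n_bodies=64, steps=2, dt=0.01, seed=0):
    """Run a few steps and return (total interactions, interactions per second)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(n_bodies)]
    vel = [[0.0, 0.0, 0.0] for _ in range(n_bodies)]
    total = 0
    start = time.perf_counter()
    for _ in range(steps):
        total += body_force(pos, vel, dt)
        for p, v in zip(pos, vel):  # integrate positions
            for k in range(3):
                p[k] += v[k] * dt
    elapsed = time.perf_counter() - start
    return total, total / elapsed


total, rate = interactions_per_second()
print(f"{total} interactions, {rate:.0f} interactions/s")
```

The work per step grows with the square of the body count, which is why the metric is expressed per n-squared interactions rather than per body.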

NVIDIA TensorRT Inference

This test profile uses any existing system installation of NVIDIA TensorRT for carrying out inference benchmarks with various neural networks. Learn more via the OpenBenchmarking.org test page.

NVIDIA TensorRT Inference - Neural Network: GoogleNet - Precision: INT8 - Batch Size: 1 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 35.87 (SE +/- 0.50, N = 3)
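Each result carries an "SE +/- x, N = n" annotation. Assuming SE here is the usual standard error of the mean (sample standard deviation divided by the square root of N), the run-to-run spread can be recovered from the reported figures. The helper names below are hypothetical; the numbers are the GoogleNet INT8 batch size 1 result.

```python
import math


def sample_std_from_se(se, n):
    """Invert SE = s / sqrt(N) to recover the sample standard deviation s."""
    return se * math.sqrt(n)


def relative_se_percent(mean, se):
    """Standard error expressed as a percentage of the mean."""
    return 100.0 * se / mean


mean, se, n = 35.87, 0.50, 3  # GoogleNet - INT8 - batch size 1
print(round(sample_std_from_se(se, n), 3))      # 0.866
print(round(relative_se_percent(mean, se), 2))  # 1.39
```

Results with a larger N (for example N = 12) are runs where the test was repeated more than the default three times, so a relative SE is a fairer basis for comparing noise across tests than the raw SE.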

XZ Compression

This test measures the time needed to compress a sample file (an Ubuntu file-system image) using XZ compression. Learn more via the OpenBenchmarking.org test page.

XZ Compression 5.2.4 - Compressing ubuntu-16.04.3-server-i386.img, Compression Level 9
Seconds, Fewer Is Better
Jetson Nano: 44.43 (SE +/- 0.86, N = 3)
(CC) gcc options: -pthread -fvisibility=hidden -O2
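The same kind of measurement can be sketched with Python's standard `lzma` module: time one XZ compression at preset 9. This is not the test profile itself (which runs the xz 5.2.4 binary on the Ubuntu image); the payload below is a small made-up stand-in, so the timing is not comparable to the 44.43 seconds reported above.

```python
import lzma
import os
import time


def time_xz_compress(data, preset=9):
    """Compress data in the .xz container format and return (seconds, compressed bytes)."""
    start = time.perf_counter()
    compressed = lzma.compress(data, format=lzma.FORMAT_XZ, preset=preset)
    elapsed = time.perf_counter() - start
    return elapsed, compressed


# Hypothetical stand-in payload: repetitive text plus some incompressible bytes.
payload = (b"sample block of repetitive data " * 4096) + os.urandom(65536)
elapsed, compressed = time_xz_compress(payload)
print(f"{len(payload)} -> {len(compressed)} bytes in {elapsed:.3f} s")
```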

MBW

This is a basic/simple memory (RAM) bandwidth benchmark for memory copy operations. Learn more via the OpenBenchmarking.org test page.

MBW 2018-09-08 - Test: Memory Copy - Array Size: 128 MiB
MiB/s, More Is Better
Jetson Nano: 3420.37 (SE +/- 7.55, N = 3)
(CC) gcc options: -O3 -march=native
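As a rough illustration of what a memory-copy bandwidth figure in MiB/s means, the sketch below times a full copy of a buffer and divides the buffer size by the elapsed time. It is an assumption-laden stand-in, not MBW: the function name, the default 16 MiB size, and the best-of-repeats policy are choices made here, and interpreter overhead means the numbers will not match the C benchmark.

```python
import time

MIB = 1024 * 1024


def copy_bandwidth_mib_s(array_mib=16, repeats=3):
    """Copy an array_mib-sized buffer `repeats` times; return best MiB/s seen."""
    src = bytearray(array_mib * MIB)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst = bytes(src)  # one full copy of the buffer
        elapsed = time.perf_counter() - start
        assert len(dst) == len(src)
        best = min(best, elapsed)
    return array_mib / best


print(f"{copy_bandwidth_mib_s():.1f} MiB/s")
```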

NVIDIA TensorRT Inference

This test profile uses any existing system installation of NVIDIA TensorRT for carrying out inference benchmarks with various neural networks. Learn more via the OpenBenchmarking.org test page.

NVIDIA TensorRT Inference - Neural Network: ResNet50 - Precision: INT8 - Batch Size: 8 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 22.16 (SE +/- 0.15, N = 3)

MBW

This is a basic/simple memory (RAM) bandwidth benchmark for memory copy operations. Learn more via the OpenBenchmarking.org test page.

MBW 2018-09-08 - Test: Memory Copy - Array Size: 512 MiB
MiB/s, More Is Better
Jetson Nano: 3438.75 (SE +/- 12.45, N = 3)
(CC) gcc options: -O3 -march=native

NVIDIA TensorRT Inference

This test profile uses any existing system installation of NVIDIA TensorRT for carrying out inference benchmarks with various neural networks. Learn more via the OpenBenchmarking.org test page.

NVIDIA TensorRT Inference - Neural Network: AlexNet - Precision: INT8 - Batch Size: 32 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 128.55 (SE +/- 0.58, N = 3)

NVIDIA TensorRT Inference - Neural Network: GoogleNet - Precision: INT8 - Batch Size: 4 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 47.83 (SE +/- 0.39, N = 3)

NVIDIA TensorRT Inference - Neural Network: AlexNet - Precision: INT8 - Batch Size: 16 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 113.76 (SE +/- 1.48, N = 3)

NVIDIA TensorRT Inference - Neural Network: VGG16 - Precision: FP16 - Batch Size: 1 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 10.29 (SE +/- 0.13, N = 8)

NVIDIA TensorRT Inference - Neural Network: AlexNet - Precision: FP16 - Batch Size: 16 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 168.83 (SE +/- 1.25, N = 3)

NVIDIA TensorRT Inference - Neural Network: ResNet50 - Precision: FP16 - Batch Size: 8 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 42.05 (SE +/- 0.20, N = 3)

NVIDIA TensorRT Inference - Neural Network: ResNet50 - Precision: INT8 - Batch Size: 4 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 20.59 (SE +/- 0.30, N = 3)

NVIDIA TensorRT Inference - Neural Network: AlexNet - Precision: FP16 - Batch Size: 4 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 115.49 (SE +/- 2.17, N = 12)

NVIDIA TensorRT Inference - Neural Network: AlexNet - Precision: FP16 - Batch Size: 8 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 133.74 (SE +/- 0.87, N = 3)

NVIDIA TensorRT Inference - Neural Network: VGG19 - Precision: FP16 - Batch Size: 4 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 11.61 (SE +/- 0.08, N = 3)

NVIDIA TensorRT Inference - Neural Network: AlexNet - Precision: FP16 - Batch Size: 1 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 54.86 (SE +/- 1.49, N = 9)

NVIDIA TensorRT Inference - Neural Network: VGG19 - Precision: FP16 - Batch Size: 1 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 8.70 (SE +/- 0.09, N = 3)

NVIDIA TensorRT Inference - Neural Network: VGG16 - Precision: FP16 - Batch Size: 8 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 14.60 (SE +/- 0.00, N = 3)

NVIDIA TensorRT Inference - Neural Network: VGG16 - Precision: FP16 - Batch Size: 4 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 14.18 (SE +/- 0.08, N = 3)

NVIDIA TensorRT Inference - Neural Network: ResNet50 - Precision: FP16 - Batch Size: 1 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 27.37 (SE +/- 0.34, N = 9)

NVIDIA TensorRT Inference - Neural Network: AlexNet - Precision: INT8 - Batch Size: 1 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 40.48 (SE +/- 0.71, N = 3)

NVIDIA TensorRT Inference - Neural Network: ResNet50 - Precision: FP16 - Batch Size: 4 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 40.65 (SE +/- 0.26, N = 3)

NVIDIA TensorRT Inference - Neural Network: AlexNet - Precision: INT8 - Batch Size: 4 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 82.30 (SE +/- 1.37, N = 4)

NVIDIA TensorRT Inference - Neural Network: ResNet50 - Precision: INT8 - Batch Size: 1 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 14.61 (SE +/- 0.12, N = 3)

NVIDIA TensorRT Inference - Neural Network: AlexNet - Precision: INT8 - Batch Size: 8 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 92.54 (SE +/- 0.96, N = 3)

MBW

This is a basic/simple memory (RAM) bandwidth benchmark for memory copy operations. Learn more via the OpenBenchmarking.org test page.

MBW 2018-09-08 - Test: Memory Copy, Fixed Block Size - Array Size: 128 MiB
MiB/s, More Is Better
Jetson Nano: 3450.26 (SE +/- 7.52, N = 3)
(CC) gcc options: -O3 -march=native

NVIDIA TensorRT Inference

This test profile uses any existing system installation of NVIDIA TensorRT for carrying out inference benchmarks with various neural networks. Learn more via the OpenBenchmarking.org test page.

NVIDIA TensorRT Inference - Neural Network: AlexNet - Precision: FP16 - Batch Size: 32 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 202.29 (SE +/- 0.75, N = 3)
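With all ten AlexNet results now listed, two simple ratios summarize them: how throughput scales with batch size, and how FP16 compares with INT8 at each batch size. The sketch below only rearranges the figures reported on this page; the dictionary layout and function names are illustrative assumptions.

```python
# Images per second for AlexNet on the Jetson Nano, copied from the results above.
alexnet = {
    "FP16": {1: 54.86, 4: 115.49, 8: 133.74, 16: 168.83, 32: 202.29},
    "INT8": {1: 40.48, 4: 82.30, 8: 92.54, 16: 113.76, 32: 128.55},
}


def batch_scaling(results):
    """Throughput at each batch size relative to batch size 1."""
    base = results[1]
    return {batch: value / base for batch, value in sorted(results.items())}


def precision_ratio(table, numerator="FP16", denominator="INT8"):
    """Ratio of numerator to denominator throughput at each batch size."""
    return {
        batch: table[numerator][batch] / table[denominator][batch]
        for batch in sorted(table[numerator])
    }


for batch, factor in batch_scaling(alexnet["FP16"]).items():
    print(f"FP16 batch {batch:>2}: {factor:.2f}x of batch 1")
for batch, ratio in precision_ratio(alexnet).items():
    print(f"batch {batch:>2}: FP16 / INT8 = {ratio:.2f}")
```

On these numbers, FP16 throughput at batch size 32 is roughly 3.7 times the batch size 1 rate, and FP16 is faster than INT8 at every batch size, by roughly 1.4x to 1.6x.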

MBW

This is a basic/simple memory (RAM) bandwidth benchmark for memory copy operations. Learn more via the OpenBenchmarking.org test page.

MBW 2018-09-08 - Test: Memory Copy, Fixed Block Size - Array Size: 512 MiB
MiB/s, More Is Better
Jetson Nano: 3448.76 (SE +/- 14.16, N = 3)
(CC) gcc options: -O3 -march=native

NVIDIA TensorRT Inference

This test profile uses any existing system installation of NVIDIA TensorRT for carrying out inference benchmarks with various neural networks. Learn more via the OpenBenchmarking.org test page.

NVIDIA TensorRT Inference - Neural Network: GoogleNet - Precision: FP16 - Batch Size: 8 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 85.92 (SE +/- 0.10, N = 3)

t-test1

This is a test of t-test1 for basic memory allocator benchmarks. Note this test profile is currently very basic and the overall time does include the warmup time of the custom t-test1 compilation. Improvements welcome. Learn more via the OpenBenchmarking.org test page.

t-test1 2017-01-13 - Threads: 1
Seconds, Fewer Is Better
Jetson Nano: 80.31 (SE +/- 0.23, N = 3)
(CC) gcc options: -pthread

t-test1 2017-01-13 - Threads: 2
Seconds, Fewer Is Better
Jetson Nano: 27.35 (SE +/- 0.07, N = 3)
(CC) gcc options: -pthread

RAMspeed SMP

This benchmark tests the system memory (RAM) performance. Learn more via the OpenBenchmarking.org test page.

RAMspeed SMP 3.5.0 - Type: Add - Benchmark: Integer
MB/s, More Is Better
Jetson Nano: 7943.83
(CC) gcc options: -O3 -march=native

NVIDIA TensorRT Inference

This test profile uses any existing system installation of NVIDIA TensorRT for carrying out inference benchmarks with various neural networks. Learn more via the OpenBenchmarking.org test page.

NVIDIA TensorRT Inference - Neural Network: GoogleNet - Precision: FP16 - Batch Size: 1 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 65.61 (SE +/- 0.96, N = 9)

NVIDIA TensorRT Inference - Neural Network: ResNet152 - Precision: FP16 - Batch Size: 16 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 16.98 (SE +/- 0.04, N = 3)

NVIDIA TensorRT Inference - Neural Network: ResNet152 - Precision: FP16 - Batch Size: 32 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 17.28 (SE +/- 0.01, N = 3)

NVIDIA TensorRT Inference - Neural Network: GoogleNet - Precision: INT8 - Batch Size: 16 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 52.19 (SE +/- 0.34, N = 3)

NVIDIA TensorRT Inference - Neural Network: GoogleNet - Precision: INT8 - Batch Size: 32 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 55.47 (SE +/- 0.21, N = 3)

NVIDIA TensorRT Inference - Neural Network: GoogleNet - Precision: FP16 - Batch Size: 16 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 93.33 (SE +/- 1.84, N = 3)

NVIDIA TensorRT Inference - Neural Network: GoogleNet - Precision: FP16 - Batch Size: 32 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 98.75 (SE +/- 0.20, N = 3)

NVIDIA TensorRT Inference - Neural Network: ResNet50 - Precision: INT8 - Batch Size: 16 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 23.82 (SE +/- 0.03, N = 3)

NVIDIA TensorRT Inference - Neural Network: ResNet50 - Precision: INT8 - Batch Size: 32 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 25.01 (SE +/- 0.06, N = 3)

NVIDIA TensorRT Inference - Neural Network: ResNet50 - Precision: FP16 - Batch Size: 16 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 44.49 (SE +/- 0.39, N = 3)

NVIDIA TensorRT Inference - Neural Network: ResNet50 - Precision: FP16 - Batch Size: 32 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 46.26 (SE +/- 0.01, N = 3)

NVIDIA TensorRT Inference - Neural Network: ResNet152 - Precision: INT8 - Batch Size: 1 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 5.45 (SE +/- 0.01, N = 3)

NVIDIA TensorRT Inference - Neural Network: ResNet152 - Precision: FP16 - Batch Size: 8 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 16.42 (SE +/- 0.08, N = 3)

NVIDIA TensorRT Inference - Neural Network: ResNet152 - Precision: FP16 - Batch Size: 4 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 15.78 (SE +/- 0.07, N = 3)

RAMspeed SMP

This benchmark tests the system memory (RAM) performance. Learn more via the OpenBenchmarking.org test page.

RAMspeed SMP 3.5.0 - Type: Copy - Benchmark: Integer
MB/s, More Is Better
Jetson Nano: 9544.18
(CC) gcc options: -O3 -march=native

RAMspeed SMP 3.5.0 - Type: Triad - Benchmark: Integer
MB/s, More Is Better
Jetson Nano: 4856.02
(CC) gcc options: -O3 -march=native

NVIDIA TensorRT Inference

This test profile uses any existing system installation of NVIDIA TensorRT for carrying out inference benchmarks with various neural networks. Learn more via the OpenBenchmarking.org test page.

NVIDIA TensorRT Inference - Neural Network: GoogleNet - Precision: FP16 - Batch Size: 4 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 85.12 (SE +/- 1.10, N = 12)

RAMspeed SMP

This benchmark tests the system memory (RAM) performance. Learn more via the OpenBenchmarking.org test page.

RAMspeed SMP 3.5.0 - Type: Scale - Benchmark: Integer
MB/s, More Is Better
Jetson Nano: 9141.59
(CC) gcc options: -O3 -march=native

RAMspeed SMP 3.5.0 - Type: Average - Benchmark: Integer
MB/s, More Is Better
Jetson Nano: 7839.77
(CC) gcc options: -O3 -march=native

CUDA Mini-Nbody

The CUDA version of Harrism's mini-nbody tests. Learn more via the OpenBenchmarking.org test page.

CUDA Mini-Nbody 2015-11-10 - Test: Cache Blocking
(NBody^2)/s, More Is Better
Jetson Nano: 8.47 (SE +/- 0.00, N = 3)

GLmark2

This is a test of Linaro's glmark2 port, currently using the X11 OpenGL 2.0 target. GLmark2 is a basic OpenGL benchmark. Learn more via the OpenBenchmarking.org test page.

GLmark2 276 - Resolution: 1024 x 768
Score, More Is Better
Jetson Nano: 1362

GLmark2 276 - Resolution: 1280 x 1024
Score, More Is Better
Jetson Nano: 904

GLmark2 276 - Resolution: 1920 x 1080
Score, More Is Better
Jetson Nano: 646

LeelaChessZero

LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.

LeelaChessZero 0.20.1 - Backend: BLAS
Nodes Per Second, More Is Better
Jetson Nano: 15.34 (SE +/- 0.10, N = 3)
(CXX) g++ options: -lpthread -lz

NVIDIA TensorRT Inference

This test profile uses any existing system installation of NVIDIA TensorRT for carrying out inference benchmarks with various neural networks. Learn more via the OpenBenchmarking.org test page.

NVIDIA TensorRT Inference - Neural Network: ResNet152 - Precision: FP16 - Batch Size: 1 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 10.09 (SE +/- 0.05, N = 3)

LeelaChessZero

LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.

LeelaChessZero 0.20.1 - Backend: CUDA + cuDNN
Nodes Per Second, More Is Better
Jetson Nano: 139 (SE +/- 0.64, N = 3)
(CXX) g++ options: -lpthread -lz

Java 2D Microbenchmark

This test runs a series of microbenchmarks to check the performance of the OpenGL-based Java 2D pipeline and the underlying OpenGL drivers. Learn more via the OpenBenchmarking.org test page.

Java 2D Microbenchmark 1.0 - Rendering Test: Text Rendering
Units Per Second, More Is Better
Jetson Nano: 6226.12 (SE +/- 34.48, N = 4)

Java 2D Microbenchmark 1.0 - Rendering Test: Image Rendering
Units Per Second, More Is Better
Jetson Nano: 897658.51 (SE +/- 1827.16, N = 4)

Java 2D Microbenchmark 1.0 - Rendering Test: Vector Graphics Rendering
Units Per Second, More Is Better
Jetson Nano: 486283.59 (SE +/- 983.45, N = 4)

7-Zip Compression

This is a test of 7-Zip using p7zip with its integrated benchmark feature or upstream 7-Zip for the Windows x64 build. Learn more via the OpenBenchmarking.org test page.

7-Zip Compression 16.02 - Compress Speed Test
MIPS, More Is Better
Jetson Nano: 4050 (SE +/- 17.21, N = 3)
(CXX) g++ options: -pipe -lpthread

NVIDIA TensorRT Inference

This test profile uses any existing system installation of NVIDIA TensorRT for carrying out inference benchmarks with various neural networks. Learn more via the OpenBenchmarking.org test page.

NVIDIA TensorRT Inference - Neural Network: GoogleNet - Precision: INT8 - Batch Size: 8 - DLA Cores: Disabled
Images Per Second, More Is Better
Jetson Nano: 49.18 (SE +/- 0.47, N = 3)

GLmark2

This is a test of Linaro's glmark2 port, currently using the X11 OpenGL 2.0 target. GLmark2 is a basic OpenGL benchmark. Learn more via the OpenBenchmarking.org test page.

GLmark2 276 - Resolution: 800 x 600
Score, More Is Better
Jetson Nano: 1915

Zstd Compression

This test measures the time needed to compress a sample file (an Ubuntu file-system image) using Zstd compression. Learn more via the OpenBenchmarking.org test page.

Zstd Compression 1.3.4 - Compressing ubuntu-16.04.3-server-i386.img, Compression Level 19
Seconds, Fewer Is Better
Jetson Nano: 127.28 (SE +/- 0.22, N = 3)
(CC) gcc options: -O3 -pthread -lz -llzma

CUDA Mini-Nbody

The CUDA version of Harrism's mini-nbody tests. Learn more via the OpenBenchmarking.org test page.

CUDA Mini-Nbody 2015-11-10 - Test: Original
(NBody^2)/s, More Is Better
Jetson Nano: 4.09 (SE +/- 0.01, N = 3)

Timed Linux Kernel Compilation

This test times how long it takes to build the Linux kernel in a default configuration. Learn more via the OpenBenchmarking.org test page.

Timed Linux Kernel Compilation 4.18 - Time To Compile
Seconds, Fewer Is Better
Jetson Nano: 2378.69 (SE +/- 13.46, N = 3)

x264

This is a simple test of the x264 encoder run on the CPU (OpenCL support disabled) with a sample video file. Learn more via the OpenBenchmarking.org test page.

x264 2018-09-25 - H.264 Video Encoding
Frames Per Second, More Is Better
Jetson Nano: 5.12 (SE +/- 0.08, N = 3)
(CC) gcc options: -ldl -lm -lpthread

71 Results Shown
