Benchmarking the effect of unrolling on AARCH64

Tegra X1 vs S905 1.5GHz Android TV boxen

HTML result view exported from: https://openbenchmarking.org/result/1702288-RI-1609085HA82.

NVIDIA Tegra X1
  Processor: Cortex A57 rev 1 @ 1.91GHz (4 Cores)
  Motherboard: foster_e_hdd
  Memory: 3072MB
  Disk: 500GB Seagate ST500LM000-1EJ16 + 16GB SDW16G + 32GB 00000
  Graphics: NVIDIA TEGRA
  OS: Ubuntu 14.10
  Kernel: 3.10.61 (aarch64)
  Display Driver: fbdev 0.4.4
  Compiler: GCC 4.9.1 + CUDA 6.5
  File-System: ext4
  Screen Resolution: 1920x2400

MXQ PRO+ Debian
  Processor: AArch64 rev 4 @ 2.02GHz (4 Cores)
  Motherboard: Amlogic
  Memory: 2048MB
  Disk: 60GB A + 16GB AGND3R + 16GB SD16G
  OS: Debian 8.3
  Kernel: 3.14.65-odroidc2 (aarch64)
  Compiler: GCC 4.9.2
  Screen Resolution: 1280x1440

Mini MXIII GCC7
  Processor: Unknown @ 1.50GHz (4 Cores)
  Disk: 16GB NCard + 32GB 00000
  OS: Ubuntu 16.04
  Kernel: 3.14.65-61 (aarch64)
  Compiler: GCC 7.0.0 20160904 + LLVM 3.8.0

Mini MXIII GCC5
  Compiler: GCC 5.3.1 20160413 + LLVM 3.8.0

Mini MXIII GCC5 unrolled
  Processor: Unknown @ 1.54GHz (4 Cores)
  Motherboard: Amlogic
  Kernel: 3.14.79-vegas95 (aarch64)
  Compiler: GCC 5.4.0 20160609 + Clang 3.8.0-2ubuntu4 + LLVM 3.8.0

Mini MXIII GCC7 unrolled
  Compiler: GCC 7.0.1 20170220 + Clang 3.8.0-2ubuntu4 + LLVM 3.8.0

Compiler Details

- NVIDIA Tegra X1: --build=aarch64-linux-gnu --disable-browser-plugin --disable-libquadmath --disable-libsanitizer --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --target=aarch64-linux-gnu --with-arch-directory=arm64 -v
- MXQ PRO+ Debian: --build=aarch64-linux-gnu --disable-browser-plugin --disable-libquadmath --disable-libsanitizer --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --target=aarch64-linux-gnu --with-arch-directory=arm64 -v
- Mini MXIII GCC7: --build=aarch64-linux-gnu --disable-bootstrap --disable-browser-plugin --disable-libquadmath --disable-werror --enable-checking=release --enable-clocale=gnu --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,c++,fortran --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --target=aarch64-linux-gnu --with-arch-directory=aarch64 --with-default-libstdcxx-abi=new
- Mini MXIII GCC5: --build=aarch64-linux-gnu --disable-browser-plugin --disable-libquadmath --disable-werror --enable-checking=release --enable-clocale=gnu --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --target=aarch64-linux-gnu --with-arch-directory=aarch64 --with-default-libstdcxx-abi=new -v
- Mini MXIII GCC5 unrolled: --build=aarch64-linux-gnu --disable-browser-plugin --disable-libquadmath --disable-werror --enable-checking=release --enable-clocale=gnu --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --target=aarch64-linux-gnu --with-arch-directory=aarch64 --with-default-libstdcxx-abi=new -v
- Mini MXIII GCC7 unrolled: --build=aarch64-linux-gnu --disable-browser-plugin --disable-libquadmath --disable-werror --enable-checking=release --enable-clocale=gnu --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,c++,fortran --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --target=aarch64-linux-gnu --with-arch-directory=aarch64 --with-default-libstdcxx-abi=new

Processor Details

- NVIDIA Tegra X1: Scaling Governor: tegra performance
- MXQ PRO+ Debian: Scaling Governor: meson_cpufreq performance
- Mini MXIII GCC7: Scaling Governor: meson_cpufreq performance
- Mini MXIII GCC5: Scaling Governor: meson_cpufreq performance
- Mini MXIII GCC5 unrolled: Scaling Governor: meson_cpufreq performance
- Mini MXIII GCC7 unrolled: Scaling Governor: meson_cpufreq performance

                                                    NVIDIA      MXQ PRO+    Mini MXIII  Mini MXIII  Mini MXIII     Mini MXIII
Test                                                Tegra X1    Debian      GCC7        GCC5        GCC5 unrolled  GCC7 unrolled
fftw: Stock - 2D FFT Size 2048                      215.66      108.88      196.93      196.79      194.66         188.14
fhourstones: Complex Connect-4 Solving              4705.57     2619.87     3136.27     3044.93     3228.87        3446.17
vpxenc: vpxenc                                      11.52       5.78        6.81        6.69        6.75           7.27
build-apache: Time To Compile                       134.51      535.29      245.49      250.33      270.46         267.69
c-ray: Total Time                                   84.74       182.11      192.70      186.85      157.42         151.77
primesieve: 1e12 Prime Number Generation            340.53      656.71      575.31      561.50      570.56         524.88
smallpt: Global Illumination Renderer; 100 Samples  121         219         169         179         169            166
stockfish: Total Time                               11431       22793       20504       22375       22197          20446
encode-flac: WAV To FLAC                            40.00       187.93      153.48      158.16      162.45         163.56
ffmpeg: H.264 HD To NTSC DV                         74.91       222.78      191.79      195.16      151.28         150.39
n-queens: Elapsed Time                              109.64      164.23      156.09      152.03      140.45         154.18
pgbench: Buffer Test - Normal Load - Read Write     510.88      652.15      598.25      626.65      543.32         619.53
pgbench: Buffer Test - Single Thread - Read Write   223.82      281.37      144.95      210.77      202.28         194.57
redis: GET                                          615979.06   247721.10   264703.30   270455.57   256786.90      272703.28
redis: SET                                          432160.48   186038.34   194408.30   197144.29   197794.05      200337.42
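The summary figures above can be turned into a relative gain for each "unrolled" run over its non-unrolled counterpart. The following is a minimal Python sketch, not part of the exported result: the names `GCC5_RESULTS`, `GCC7_RESULTS`, `relative_gain` and `report` are made up for illustration, and the better-is-higher/lower direction for each test is taken from the per-test charts. Note that the "unrolled" runs also differ in other reported settings (for example `-Ofast`, `-mcpu=thunderx`, kernel version and reported clock), so the numbers should not be read as the effect of `-funroll-loops` alone.

```python
# Illustrative only: values copied from the summary table of this result file.
# Each row is (test, higher_is_better, baseline, unrolled).
GCC5_RESULTS = [
    ("fftw: Stock - 2D FFT Size 2048", True, 196.79, 194.66),
    ("fhourstones: Complex Connect-4 Solving", True, 3044.93, 3228.87),
    ("vpxenc: vpxenc", True, 6.69, 6.75),
    ("build-apache: Time To Compile", False, 250.33, 270.46),
    ("c-ray: Total Time", False, 186.85, 157.42),
    ("primesieve: 1e12 Prime Number Generation", False, 561.50, 570.56),
    ("smallpt: Global Illumination Renderer", False, 179, 169),
    ("stockfish: Total Time", False, 22375, 22197),
    ("encode-flac: WAV To FLAC", False, 158.16, 162.45),
    ("ffmpeg: H.264 HD To NTSC DV", False, 195.16, 151.28),
    ("n-queens: Elapsed Time", False, 152.03, 140.45),
    ("pgbench: Normal Load - Read Write", True, 626.65, 543.32),
    ("pgbench: Single Thread - Read Write", True, 210.77, 202.28),
    ("redis: GET", True, 270455.57, 256786.90),
    ("redis: SET", True, 197144.29, 197794.05),
]

GCC7_RESULTS = [
    ("fftw: Stock - 2D FFT Size 2048", True, 196.93, 188.14),
    ("fhourstones: Complex Connect-4 Solving", True, 3136.27, 3446.17),
    ("vpxenc: vpxenc", True, 6.81, 7.27),
    ("build-apache: Time To Compile", False, 245.49, 267.69),
    ("c-ray: Total Time", False, 192.70, 151.77),
    ("primesieve: 1e12 Prime Number Generation", False, 575.31, 524.88),
    ("smallpt: Global Illumination Renderer", False, 169, 166),
    ("stockfish: Total Time", False, 20504, 20446),
    ("encode-flac: WAV To FLAC", False, 153.48, 163.56),
    ("ffmpeg: H.264 HD To NTSC DV", False, 191.79, 150.39),
    ("n-queens: Elapsed Time", False, 156.09, 154.18),
    ("pgbench: Normal Load - Read Write", True, 598.25, 619.53),
    ("pgbench: Single Thread - Read Write", True, 144.95, 194.57),
    ("redis: GET", True, 264703.30, 272703.28),
    ("redis: SET", True, 194408.30, 200337.42),
]


def relative_gain(baseline, unrolled, higher_is_better):
    """Return the percent gain of the unrolled run; positive means it did better."""
    if higher_is_better:
        # Throughput-style metric: more is better.
        return (unrolled / baseline - 1.0) * 100.0
    # Time-style metric: less is better, so invert the ratio.
    return (baseline / unrolled - 1.0) * 100.0


def report(label, rows):
    """Print one line per test with the signed percent gain."""
    for test, higher_is_better, baseline, unrolled in rows:
        gain = relative_gain(baseline, unrolled, higher_is_better)
        print(f"{label:5} {test:45} {gain:+8.2f}%")


if __name__ == "__main__":
    report("GCC5", GCC5_RESULTS)
    report("GCC7", GCC7_RESULTS)
```

Inverting the ratio for time-based tests keeps the sign convention uniform, so a positive number always means the unrolled build did better. The standard errors listed with each chart are not used here; for tests with a large SE, such as pgbench, small differences may not be meaningful.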

FFTW

Build: Stock - Size: 2D FFT Size 2048

FFTW 3.3.4
Mflops, More Is Better

System                    Result      SE +/-    N
NVIDIA Tegra X1           215.66      0.96      5
MXQ PRO+ Debian           108.88      0.11      5
Mini MXIII GCC7           196.93      0.13      5
Mini MXIII GCC5           196.79      0.12      5
Mini MXIII GCC5 unrolled  194.66      0.13      5
Mini MXIII GCC7 unrolled  188.14      0.16      5

Additional compiler flags per system:
- NVIDIA Tegra X1: -std=gnu99 -O3 -fstrict-aliasing -fno-schedule-insns -ffast-math
- MXQ PRO+ Debian: -std=gnu99 -mcpu=cortex-a53 -O3 -fipa-pta -march=armv8-a+crc -ftree-vectorize -ffast-math
- Mini MXIII GCC7: -O3 -mcpu=cortex-a53 -fipa-pta -march=armv8-a+crc
- Mini MXIII GCC5: -O3 -mcpu=cortex-a53 -fipa-pta -march=armv8-a+crc
- Mini MXIII GCC5 unrolled: -Ofast -mcpu=thunderx -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts
- Mini MXIII GCC7 unrolled: -Ofast -mcpu=thunderx -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts

1. (CC) gcc options: -fomit-frame-pointer -lm

Fhourstones

Complex Connect-4 Solving

Fhourstones 3.1
Kpos / sec, More Is Better

System                    Result      SE +/-    N
NVIDIA Tegra X1           4705.57     1.07      3
MXQ PRO+ Debian           2619.87     0.95      3
Mini MXIII GCC7           3136.27     2.43      3
Mini MXIII GCC5           3044.93     1.77      3
Mini MXIII GCC5 unrolled  3228.87     0.72      3
Mini MXIII GCC7 unrolled  3446.17     1.88      3

1. (CC) gcc options: -O3

VP8 libvpx Encoding

vpxenc

VP8 libvpx Encoding 1.3.0
Frames Per Second, More Is Better

System                    Result      SE +/-    N
NVIDIA Tegra X1           11.52       0.15      3
MXQ PRO+ Debian           5.78        0.10      6
Mini MXIII GCC7           6.81        0.09      3
Mini MXIII GCC5           6.69        0.03      3
Mini MXIII GCC5 unrolled  6.75        0.11      4
Mini MXIII GCC7 unrolled  7.27        0.11      6

Additional compiler flags reported with the chart, in chart order:
- -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -flto -ffat-lto-objects -ftree-vectorize -fuse-linker-plugin
- -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
- -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
- -Ofast -mcpu=thunderx -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts

1. (CXX) g++ options: -lvpx -lgtest -lpthread -lm -O3

Timed Apache Compilation

Time To Compile

Timed Apache Compilation 2.4.7
Seconds, Fewer Is Better

System                    Result      SE +/-    N
NVIDIA Tegra X1           134.51      1.47      3
MXQ PRO+ Debian           535.29      0.29      3
Mini MXIII GCC7           245.49      0.03      3
Mini MXIII GCC5           250.33      0.25      3
Mini MXIII GCC5 unrolled  270.46      0.39      3
Mini MXIII GCC7 unrolled  267.69      0.25      3

C-Ray

Total Time

C-Ray 1.1
Seconds, Fewer Is Better

System                    Result      SE +/-    N
NVIDIA Tegra X1           84.74       0.54      3
MXQ PRO+ Debian           182.11      0.71      3
Mini MXIII GCC7           192.70      0.20      3
Mini MXIII GCC5           186.85      3.19      4
Mini MXIII GCC5 unrolled  157.42      1.52      3
Mini MXIII GCC7 unrolled  151.77      0.02      3

Additional compiler flags reported with the chart, in chart order:
- -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -flto -ffat-lto-objects -ftree-vectorize -fuse-linker-plugin
- -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
- -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
- -Ofast -mcpu=thunderx -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts
- -Ofast -mcpu=thunderx -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts

1. (CC) gcc options: -lm -lpthread -O3

Primesieve

1e12 Prime Number Generation

Primesieve 5.4.2
Seconds, Fewer Is Better

System                    Result      SE +/-    N
NVIDIA Tegra X1           340.53      7.87      6
MXQ PRO+ Debian           656.71      13.09     3
Mini MXIII GCC7           575.31      9.24      4
Mini MXIII GCC5           561.50      6.91      3
Mini MXIII GCC5 unrolled  570.56      11.36     3
Mini MXIII GCC7 unrolled  524.88      1.53      3

Additional compiler flags per system:
- NVIDIA Tegra X1: -O2
- MXQ PRO+ Debian: -mcpu=cortex-a53 -O3 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -flto -ffat-lto-objects -ftree-vectorize -fuse-linker-plugin
- Mini MXIII GCC7: -O3 -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
- Mini MXIII GCC5: -O3 -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
- Mini MXIII GCC5 unrolled: -Ofast -mcpu=thunderx -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts
- Mini MXIII GCC7 unrolled: -Ofast -mcpu=thunderx -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts

1. (CXX) g++ options: -fopenmp

Smallpt

Global Illumination Renderer; 100 Samples

Smallpt 1.0
Seconds, Fewer Is Better

System                    Result      SE +/-    N
NVIDIA Tegra X1           121         1.45      3
MXQ PRO+ Debian           219         0.67      3
Mini MXIII GCC7           169         0.33      3
Mini MXIII GCC5           179         0.33      3
Mini MXIII GCC5 unrolled  169         0.33      3
Mini MXIII GCC7 unrolled  166         0.00      3

Additional compiler flags reported with the chart, in chart order:
- -mcpu=cortex-a53 -O3 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -flto -ffat-lto-objects -ftree-vectorize -fuse-linker-plugin
- -O3 -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
- -O3 -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
- -Ofast -mcpu=thunderx -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts
- -Ofast -mcpu=thunderx -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts

1. (CXX) g++ options: -fopenmp

Stockfish

Total Time

Stockfish 2014-11-26
ms, Fewer Is Better

System                    Result      SE +/-    N
NVIDIA Tegra X1           11431       45.88     3
MXQ PRO+ Debian           22793       376.99    3
Mini MXIII GCC7           20504       78.03     3
Mini MXIII GCC5           22375       85.99     3
Mini MXIII GCC5 unrolled  22197       70.24     3
Mini MXIII GCC7 unrolled  20446       11.72     3

Additional compiler flags reported with the chart, in chart order:
- -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ffat-lto-objects -ftree-vectorize -fuse-linker-plugin
- -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
- -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
- -Ofast -mcpu=thunderx -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts
- -Ofast -mcpu=thunderx -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts

1. (CXX) g++ options: -lpthread -fno-exceptions -fno-rtti -ansi -pedantic -O3 -flto

FLAC Audio Encoding

WAV To FLAC

FLAC Audio Encoding 1.3.1
Seconds, Fewer Is Better

System                    Result      SE +/-    N
NVIDIA Tegra X1           40.00       0.04      5
MXQ PRO+ Debian           187.93      0.61      5
Mini MXIII GCC7           153.48      0.10      5
Mini MXIII GCC5           158.16      0.94      5
Mini MXIII GCC5 unrolled  162.45      0.09      5
Mini MXIII GCC7 unrolled  163.56      0.89      5

Additional compiler flags per system:
- NVIDIA Tegra X1: -O2
- MXQ PRO+ Debian: -mcpu=cortex-a53 -O3 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -flto -ffat-lto-objects -ftree-vectorize -fuse-linker-plugin
- Mini MXIII GCC7: -O3 -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
- Mini MXIII GCC5: -O3 -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
- Mini MXIII GCC5 unrolled: -Ofast -mcpu=thunderx -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts -logg
- Mini MXIII GCC7 unrolled: -Ofast -mcpu=thunderx -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts -logg

1. (CXX) g++ options: -fvisibility=hidden -lm

FFmpeg

H.264 HD To NTSC DV

FFmpeg 2.6.2
Seconds, Fewer Is Better

System                    Result      SE +/-    N
NVIDIA Tegra X1           74.91       0.18      3
MXQ PRO+ Debian           222.78      1.33      3
Mini MXIII GCC7           191.79      0.90      3
Mini MXIII GCC5           195.16      0.92      3
Mini MXIII GCC5 unrolled  151.28      1.57      3
Mini MXIII GCC7 unrolled  150.39      0.97      3

Additional compiler flags reported with the chart, in chart order:
- -mcpu=cortex-a53 -fipa-pta -march=armv8-a+crc -flto -ffat-lto-objects -ftree-vectorize -fuse-linker-plugin
- -lxcb -lxcb-shm -lX11 -mcpu=cortex-a53 -fipa-pta -march=armv8-a+crc
- -lxcb -lxcb-shm -lX11 -mcpu=cortex-a53 -fipa-pta -march=armv8-a+crc
- -lXv -lX11 -lXext -lxcb -lxcb-shm -lxcb-xfixes -lxcb-render -lxcb-shape -lasound -lSDL -llzma -lbz2 -Ofast -mcpu=thunderx -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts
- -lXv -lX11 -lXext -lxcb -lxcb-shm -lxcb-xfixes -lxcb-render -lxcb-shape -lasound -lSDL -llzma -lbz2 -Ofast -mcpu=thunderx -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts

1. (CC) gcc options: -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -lm -pthread -std=c99 -fomit-frame-pointer -O3 -fno-math-errno -fno-signed-zeros -fno-tree-vectorize -MMD -MF -MT

N-Queens

Elapsed Time

N-Queens 1.0
Seconds, Fewer Is Better

System                    Result      SE +/-    N
NVIDIA Tegra X1           109.64      0.00      3
MXQ PRO+ Debian           164.23      0.03      3
Mini MXIII GCC7           156.09      0.01      3
Mini MXIII GCC5           152.03      0.00      3
Mini MXIII GCC5 unrolled  140.45      0.00      3
Mini MXIII GCC7 unrolled  154.18      0.00      3

Additional compiler flags reported with the chart, in chart order:
- -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ftree-vectorize
- -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
- -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
- -Ofast -mcpu=thunderx -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts
- -Ofast -mcpu=thunderx -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts

1. (CC) gcc options: -static -fopenmp -O3

PostgreSQL pgbench

Scaling: Buffer Test - Test: Normal Load - Mode: Read Write

PostgreSQL pgbench 9.4.3
TPS, More Is Better

System                    Result      SE +/-    N
NVIDIA Tegra X1           510.88      29.11     6
MXQ PRO+ Debian           652.15      11.16     3
Mini MXIII GCC5           626.65      72.37     6
Mini MXIII GCC7           598.25      74.34     6
Mini MXIII GCC7 unrolled  619.53      67.95     6
Mini MXIII GCC5 unrolled  543.32      64.22     6

Additional compiler flags per system:
- NVIDIA Tegra X1: -O2
- MXQ PRO+ Debian: -mcpu=cortex-a53 -O3 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -flto -ffat-lto-objects -ftree-vectorize -fuse-linker-plugin
- Mini MXIII GCC5: -O3 -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
- Mini MXIII GCC7: -O3 -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
- Mini MXIII GCC7 unrolled: -Ofast -mcpu=thunderx -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts
- Mini MXIII GCC5 unrolled: -Ofast -mcpu=thunderx -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts

1. (CC) gcc options: -fno-strict-aliasing -fwrapv -pthread -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm

PostgreSQL pgbench

Scaling: Buffer Test - Test: Single Thread - Mode: Read Write

PostgreSQL pgbench 9.4.3
TPS, More Is Better

System                    Result      SE +/-    N
NVIDIA Tegra X1           223.82      5.00      6
MXQ PRO+ Debian           281.37      0.39      3
Mini MXIII GCC5           210.77      7.30      6
Mini MXIII GCC7           144.95      6.47      6
Mini MXIII GCC7 unrolled  194.57      6.85      6
Mini MXIII GCC5 unrolled  202.28      5.08      6

Additional compiler flags per system:
- NVIDIA Tegra X1: -O2
- MXQ PRO+ Debian: -mcpu=cortex-a53 -O3 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -flto -ffat-lto-objects -ftree-vectorize -fuse-linker-plugin
- Mini MXIII GCC5: -O3 -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
- Mini MXIII GCC7: -O3 -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
- Mini MXIII GCC7 unrolled: -Ofast -mcpu=thunderx -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts
- Mini MXIII GCC5 unrolled: -Ofast -mcpu=thunderx -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts

1. (CC) gcc options: -fno-strict-aliasing -fwrapv -pthread -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm

Redis

Test: GET

Redis 3.0.1
Requests Per Second, More Is Better

System                    Result      SE +/-    N
NVIDIA Tegra X1           615979.06   5219.38   3
MXQ PRO+ Debian           247721.10   3753.59   3
Mini MXIII GCC7           264703.30   1308.37   3
Mini MXIII GCC5           270455.57   1411.09   3
Mini MXIII GCC7 unrolled  272703.28   1872.56   3
Mini MXIII GCC5 unrolled  256786.90   2458.15   3

Additional compiler flags per system:
- NVIDIA Tegra X1: -std=gnu99 -pipe -g3 -O3 -funroll-loops
- MXQ PRO+ Debian: -std=gnu99 -pipe -g3 -O3 -funroll-loops -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -flto -ffat-lto-objects -ftree-vectorize -fuse-linker-plugin
- Mini MXIII GCC7: -O2 -O3 -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
- Mini MXIII GCC5: -O2 -O3 -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
- Mini MXIII GCC7 unrolled: -O2 -Ofast -mcpu=thunderx -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts
- Mini MXIII GCC5 unrolled: -O2 -Ofast -mcpu=thunderx -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts

1. (CC) gcc options: -ggdb -rdynamic -lm -pthread -ldl

Redis

Test: SET

Redis 3.0.1
Requests Per Second, More Is Better

System                    Result      SE +/-    N
NVIDIA Tegra X1           432160.48   3893.31   3
MXQ PRO+ Debian           186038.34   559.40    3
Mini MXIII GCC7           194408.30   826.93    3
Mini MXIII GCC5           197144.29   1485.31   3
Mini MXIII GCC7 unrolled  200337.42   594.80    3
Mini MXIII GCC5 unrolled  197794.05   957.56    3

Additional compiler flags per system:
- NVIDIA Tegra X1: -std=gnu99 -pipe -g3 -O3 -funroll-loops
- MXQ PRO+ Debian: -std=gnu99 -pipe -g3 -O3 -funroll-loops -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -flto -ffat-lto-objects -ftree-vectorize -fuse-linker-plugin
- Mini MXIII GCC7: -O2 -O3 -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
- Mini MXIII GCC5: -O2 -O3 -mcpu=cortex-a53 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
- Mini MXIII GCC7 unrolled: -O2 -Ofast -mcpu=thunderx -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts
- Mini MXIII GCC5 unrolled: -O2 -Ofast -mcpu=thunderx -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ftree-vectorize -funroll-loops -ftree-loop-ivcanon -fivopts

1. (CC) gcc options: -ggdb -rdynamic -lm -pthread -ldl


Phoronix Test Suite v10.8.4