AARCH64 codegen comparison update

GCC 7 performance on Cortex-A53 (32 kB L1)

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 1703238-RI-GCCLATEST31
Runs

Result Identifier | Date Run
A53 vectorize, pre-patch | January 11 2017
thunderx/vectorize, pre-patch | January 10 2017
A53 vectorize/LTO, pre patch | January 11 2017
A53, post patch | January 14 2017
A53 mtune/vectorize, post-patch | January 14 2017
A53 vectorize, updated | February 15 2017
A53 vectorize, earlier build | February 25 2017
A57 vectorize/unrolled GCC 7.0.1 | March 22 2017
A53 vectorize GCC 7.0.1 | March 23 2017
A57 vectorize/unrolled GCC 6.3 | March 22 2017

System Details

Values per field, in the order the source gives them (the source does not map the later values to individual runs):

Processor: AArch64 rev 4 @ 1.50GHz (4 Cores); AArch64 rev 4 @ 1.55GHz (4 Cores); Unknown @ 1.54GHz (4 Cores)
Motherboard: Amlogic
Memory: 2048MB
Disk: 32GB 00000 + 16GB NCard; 16GB NCard + 32GB 00000; 8GB NCard + 32GB 00000
OS: Ubuntu 16.04
Kernel: 3.14.29 (aarch64); 3.14.79-vegas95 (aarch64)
Compiler: GCC 7.0.0 20170110 + Clang 3.8.0-2ubuntu4 + LLVM 3.8.0; GCC 7.0.0 20170113 + Clang 3.8.0-2ubuntu4 + LLVM 3.8.0; GCC 7.0.1 20170214 + Clang 3.8.0-2ubuntu4 + LLVM 3.8.0; GCC 7.0.1 20170220 + Clang 3.8.0-2ubuntu4 + LLVM 3.8.0; GCC 7.0.1 20170322 + Clang 3.8.0-2ubuntu4 + LLVM 3.8.0; GCC 6.3.1 20170316 + Clang 3.8.0-2ubuntu4 + LLVM 3.8.0
File-System: ext4
Screen Resolution: 1920x3240; 1280x1440

Compiler Details

A53 vectorize, pre-patch; thunderx/vectorize, pre-patch; A53 vectorize/LTO, pre patch; A53, post patch; A53 mtune/vectorize, post-patch; A53 vectorize, updated; A53 vectorize, earlier build:
--build=aarch64-linux-gnu --disable-browser-plugin --disable-libquadmath --disable-werror --enable-checking=release --enable-clocale=gnu --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,c++,fortran --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --target=aarch64-linux-gnu --with-arch-directory=aarch64 --with-default-libstdcxx-abi=new

A57 vectorize/unrolled GCC 7.0.1; A53 vectorize GCC 7.0.1; A57 vectorize/unrolled GCC 6.3:
--build=aarch64-linux-gnu --disable-bootstrap --disable-browser-plugin --disable-libquadmath --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-languages=c,c++,fortran --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --target=aarch64-linux-gnu --with-arch-directory=aarch64 --with-default-libstdcxx-abi=new

Disk Details

Run | I/O scheduler | Mount options
A53 vectorize, pre-patch | CFQ | commit=30,errors=remount-ro,noatime,nodiratime,rw
thunderx/vectorize, pre-patch | CFQ | commit=30,errors=remount-ro,noatime,nodiratime,rw
A53 vectorize/LTO, pre patch | CFQ | commit=30,errors=remount-ro,noatime,nodiratime,rw
A53, post patch | CFQ | commit=30,errors=remount-ro,noatime,nodiratime,rw
A53 mtune/vectorize, post-patch | CFQ | commit=30,errors=remount-ro,noatime,nodiratime,rw
A53 vectorize, updated | DEADLINE | commit=45,errors=remount-ro,noatime,nodiratime,rw
A53 vectorize, earlier build | CFQ | commit=45,errors=remount-ro,noatime,nodiratime,rw
A57 vectorize/unrolled GCC 7.0.1 | CFQ | commit=120,errors=remount-ro,noatime,nodiratime,rw
A53 vectorize GCC 7.0.1 | CFQ | commit=120,errors=remount-ro,noatime,nodiratime,rw
A57 vectorize/unrolled GCC 6.3 | CFQ | commit=120,errors=remount-ro,noatime,nodiratime,rw

Processor Details

Run | Scaling Governor
A53 vectorize, pre-patch | meson_cpufreq performance
thunderx/vectorize, pre-patch | meson_cpufreq performance
A53 vectorize/LTO, pre patch | meson_cpufreq performance
A53, post patch | meson_cpufreq performance
A53 mtune/vectorize, post-patch | meson_cpufreq interactive
A53 vectorize, updated | meson_cpufreq performance
A53 vectorize, earlier build | meson_cpufreq performance
A57 vectorize/unrolled GCC 7.0.1 | meson_cpufreq performance
A53 vectorize GCC 7.0.1 | meson_cpufreq performance
A57 vectorize/unrolled GCC 6.3 | meson_cpufreq performance

[Figure: Result Overview (Phoronix Test Suite) for the ten runs. Axis ticks: 100%, 119%, 138%, 157%. Tests shown: RAMspeed SMP, FFTW, C-Ray, PostMark, Redis, Timed MAFFT Alignment, Primesieve, TTSIOD 3D Renderer, Fhourstones, Smallpt, OpenSSL, Sudokut, GMPbench.]

AARCH64 codegen comparison update: result table

Columns, in order: postmark (Disk Transaction Performance); ramspeed-int (Copy - Integer); ramspeed-fp (Copy - Floating Point); fftw (Stock - 2D FFT Size 2048); mafft (Multiple Sequence Alignment); gmpbench (Total Time); fhourstones (Complex Connect-4 Solving); ttsiod-renderer (Phong Rendering With Soft-Shadow Mapping); c-ray (Total Time); primesieve (1e12 Prime Number Generation); smallpt (Global Illumination Renderer; 100 Samples); sudokut (Total Time); tachyon (Total Time); openssl (RSA 4096-bit Performance); redis (GET). A dash marks a result the source does not list.

Run | postmark | ramspeed-int | ramspeed-fp | fftw | mafft | gmpbench | fhourstones | ttsiod-renderer | c-ray | primesieve | smallpt | sudokut | tachyon | openssl | redis
A53 vectorize, pre-patch | 1363 | 4581.32 | 4580.39 | 196.90 | 35.42 | 552.84 | 3212.10 | 23.16 | 187.97 | 543.16 | 167 | 101.95 | 69.27 | 21.50 | 310344.73
thunderx/vectorize, pre-patch | 1351 | 2821.43 | 2817.45 | 190.63 | 34.46 | 554.83 | 3210.20 | 23.01 | 149.82 | 566.21 | 167 | 102.75 | 71.41 | 21.50 | 318926.02
A53 vectorize/LTO, pre patch | 1378 | 4829.91 | 4825.13 | 180.53 | 33.16 | 554.37 | 3213.77 | 23.77 | 184.81 | 540.95 | 168 | 101.75 | 67.64 | 21.50 | 311785.02
A53, post patch | 1381 | 4965.06 | 4964.66 | 186.21 | 33.06 | 552.56 | 3209.67 | 23.47 | 186.69 | 553.13 | 168 | 101.88 | 69.40 | 21.50 | 309030.64
A53 mtune/vectorize, post-patch | 1378 | 4955.97 | 4965.60 | 184.81 | 32.17 | 555.10 | 3205.40 | 23.49 | 186.61 | 573.13 | 167 | 102.17 | 69.34 | 21.50 | 313438.91
A53 vectorize, updated | 1217 | 4706.40 | 4785.59 | 185.15 | 33.90 | 554.11 | 3223.57 | 23.29 | 161.80 | 574.65 | 166 | 102.72 | 69.90 | 21.40 | 277268.23
A53 vectorize, earlier build | 1211 | 4816.69 | 4816.71 | 191.54 | 34.52 | 554.21 | 3233.77 | 23.49 | 162.23 | 523.43 | 166 | 102.59 | 69.39 | 21.40 | 283742.83
A57 vectorize/unrolled GCC 7.0.1 | 1184 | 4201.53 | 4193.97 | 173.03 | 34.22 | 553.17 | 3398.47 | 23.56 | 154.81 | 525.00 | 168 | 103.04 | - | 21.47 | 276169.86
A53 vectorize GCC 7.0.1 | 1194 | 4161.09 | 4188.18 | 156.72 | 35.52 | 552.75 | 3415.07 | 23.13 | 151.02 | 547.12 | 166 | 102.97 | 69.64 | 21.47 | 276298.44
A57 vectorize/unrolled GCC 6.3 | 1190 | 4384.53 | 4388.73 | 157.97 | 35.47 | 554.03 | 3325.60 | 22.29 | 149.61 | 531.95 | 169 | 103.00 | 72.28 | 21.20 | 275458.70
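Because some tests report "more is better" and others "fewer is better", comparing two runs by eye is error-prone. The following is a minimal sketch of one way to express a candidate run's gain over a baseline run as a signed fraction. The helper name `relative_gain` and the choice of baseline and candidate are illustrative assumptions, not part of the Phoronix Test Suite; the numbers are copied from the result table above.

```python
def relative_gain(baseline, candidate, higher_is_better):
    """Gain of candidate over baseline as a fraction; positive means candidate is faster."""
    if higher_is_better:
        return candidate / baseline - 1.0
    # For time-based results, a smaller value is the better one.
    return baseline / candidate - 1.0


# (baseline, candidate, higher_is_better), values copied from the result table.
# Baseline: "A53 vectorize, pre-patch"; candidate: "A53 vectorize GCC 7.0.1".
results = {
    "c-ray (seconds)": (187.97, 151.02, False),
    "fftw (Mflops)": (196.90, 156.72, True),
    "fhourstones (Kpos / sec)": (3212.10, 3415.07, True),
}

for name, (base, cand, higher) in results.items():
    print(f"{name}: {relative_gain(base, cand, higher):+.1%}")
```

A positive figure means the candidate run did better on that test, whichever direction the test's unit runs.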

PostMark

This is a test of NetApp's PostMark benchmark designed to simulate small-file testing similar to the tasks endured by web and mail servers. This test profile will set PostMark to perform 25,000 transactions with 500 files simultaneously with the file sizes ranging between 5 and 512 kilobytes. Learn more via the OpenBenchmarking.org test page.

PostMark 1.51, Disk Transaction Performance (TPS, more is better)

Run | Result | SE +/- | N | Min | Avg | Max
A53 vectorize, pre-patch | 1363 | 2.67 | 3 | 1358 | 1363.33 | 1366
thunderx/vectorize, pre-patch | 1351 | 0.00 | 3 | 1351 | 1351 | 1351
A53 vectorize/LTO, pre patch | 1378 | 2.67 | 3 | 1373 | 1378.33 | 1381
A53, post patch | 1381 | 4.33 | 3 | 1373 | 1380.67 | 1388
A53 mtune/vectorize, post-patch | 1378 | 5.00 | 3 | 1373 | 1378 | 1388
A53 vectorize, updated | 1217 | 2.00 | 3 | 1213 | 1217 | 1219
A53 vectorize, earlier build | 1211 | 5.29 | 3 | 1201 | 1211 | 1219
A57 vectorize/unrolled GCC 7.0.1 | 1184 | - | - | - | - | -
A53 vectorize GCC 7.0.1 | 1194 | 3.67 | 3 | 1190 | 1193.67 | 1201
A57 vectorize/unrolled GCC 6.3 | 1190 | - | - | - | - | -

The source lists SE and Min / Avg / Max for eight of the ten runs; they are matched to runs here by their averages, and a dash marks a value the source does not list.
(CC) gcc options: -O3
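The "SE +/-" figures reported with each result appear to be the standard error of the mean over the N runs. As a rough check, the sketch below recomputes it for the PostMark entry with Min 1190 / Avg 1193.67 / Max 1201 at N = 3. The three per-run values are not given in the source; they are reconstructed here as an assumption (with N = 3, a minimum of 1190, a maximum of 1201 and an average of 1193.67 leave 1190 as the remaining value). The helper name `standard_error` is illustrative.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation divided by sqrt(N)."""
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Assumed per-run TPS values, reconstructed from Min / Avg / Max at N = 3.
samples = [1190, 1190, 1201]

print(round(statistics.mean(samples), 2))  # 1193.67
print(round(standard_error(samples), 2))   # 3.67
```

The result agrees with the SE +/- 3.67 listed for that entry, which suggests the sample (N - 1) standard deviation is used rather than the population one.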

RAMspeed SMP

RAMspeed SMP 3.5.0, Type: Copy (MB/s, more is better)

Run | Benchmark: Integer | Benchmark: Floating Point
A53 vectorize, pre-patch | 4581.32 | 4580.39
thunderx/vectorize, pre-patch | 2821.43 | 2817.45
A53 vectorize/LTO, pre patch | 4829.91 | 4825.13
A53, post patch | 4965.06 | 4964.66
A53 mtune/vectorize, post-patch | 4955.97 | 4965.60
A53 vectorize, updated | 4706.40 | 4785.59
A53 vectorize, earlier build | 4816.69 | 4816.71
A57 vectorize/unrolled GCC 7.0.1 | 4201.53 | 4193.97
A53 vectorize GCC 7.0.1 | 4161.09 | 4188.18
A57 vectorize/unrolled GCC 6.3 | 4384.53 | 4388.73

FFTW

FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions. Learn more via the OpenBenchmarking.org test page.

FFTW 3.3.4, Build: Stock - Size: 2D FFT Size 2048 (Mflops, more is better)

Run | Result | SE +/- | N | Min | Max | Per-run options
A53 vectorize, pre-patch | 196.90 | 0.99 | 5 | 194.64 | 200.5 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
thunderx/vectorize, pre-patch | 190.63 | 1.10 | 5 | 188.41 | 193.37 | -Ofast -mcpu=thunderx -ftree-vectorize
A53 vectorize/LTO, pre patch | 180.53 | 0.49 | 5 | 179.17 | 181.62 | -O3 -mcpu=cortex-a53 -ftree-vectorize -flto -ffat-lto-objects
A53, post patch | 186.21 | 0.08 | 5 | 185.91 | 186.38 | -Ofast -mcpu=cortex-a53
A53 mtune/vectorize, post-patch | 184.81 | 0.21 | 5 | 184.26 | 185.55 | -Ofast -mtune=cortex-a53 -ftree-vectorize
A53 vectorize, updated | 185.15 | 0.06 | 5 | 184.94 | 185.26 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
A53 vectorize, earlier build | 191.54 | 0.16 | 5 | 191.17 | 191.94 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
A57 vectorize/unrolled GCC 7.0.1 | 173.03 | 0.10 | 5 | 172.71 | 173.29 | -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops
A53 vectorize GCC 7.0.1 | 156.72 | 0.16 | 5 | 156.13 | 157.07 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
A57 vectorize/unrolled GCC 6.3 | 157.97 | 0.26 | 5 | 157.06 | 158.54 | -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops

Common to all runs: (CC) gcc options: -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -lm

Timed MAFFT Alignment

Timed MAFFT Alignment 6.864, Multiple Sequence Alignment (Seconds, fewer is better)

Run | Result | SE +/- | N | Min | Max
A53 vectorize, pre-patch | 35.42 | 0.80 | 6 | 32.14 | 38.35
thunderx/vectorize, pre-patch | 34.46 | 0.73 | 6 | 31.91 | 35.82
A53 vectorize/LTO, pre patch | 33.16 | 0.70 | 6 | 31.84 | 35.37
A53, post patch | 33.06 | 0.71 | 6 | 31.61 | 35.43
A53 mtune/vectorize, post-patch | 32.17 | 0.01 | 3 | 32.15 | 32.19
A53 vectorize, updated | 33.90 | 0.79 | 6 | 31.77 | 35.67
A53 vectorize, earlier build | 34.52 | 1.04 | 6 | 32.3 | 38.29
A57 vectorize/unrolled GCC 7.0.1 | 34.22 | 0.71 | 6 | 32.63 | 36.05
A53 vectorize GCC 7.0.1 | 35.52 | 0.97 | 6 | 32.7 | 38.92
A57 vectorize/unrolled GCC 6.3 | 35.47 | 0.08 | 3 | 35.37 | 35.62

(CC) gcc options: -O3 -lm -lpthread

GMPbench

GMPbench 0.2, Total Time (GMPbench Score, more is better)

Run | Result | Per-run options
A53 vectorize, pre-patch | 552.84 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
thunderx/vectorize, pre-patch | 554.83 | -Ofast -mcpu=thunderx -ftree-vectorize
A53 vectorize/LTO, pre patch | 554.37 | -O3 -mcpu=cortex-a53 -ftree-vectorize -flto -ffat-lto-objects
A53, post patch | 552.56 | -Ofast -mcpu=cortex-a53
A53 mtune/vectorize, post-patch | 555.10 | -Ofast -mtune=cortex-a53 -ftree-vectorize
A53 vectorize, updated | 554.11 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
A53 vectorize, earlier build | 554.21 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
A57 vectorize/unrolled GCC 7.0.1 | 553.17 | -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops
A53 vectorize GCC 7.0.1 | 552.75 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
A57 vectorize/unrolled GCC 6.3 | 554.03 | -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops

Common to all runs: (CC) gcc options: -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -lm

Fhourstones

Fhourstones 3.1, Complex Connect-4 Solving (Kpos / sec, more is better)

Run | Result | SE +/- | N | Min | Max
A53 vectorize, pre-patch | 3212.10 | 0.35 | 3 | 3211.5 | 3212.7
thunderx/vectorize, pre-patch | 3210.20 | 0.76 | 3 | 3208.7 | 3211.2
A53 vectorize/LTO, pre patch | 3213.77 | 0.22 | 3 | 3213.5 | 3214.2
A53, post patch | 3209.67 | 1.47 | 3 | 3207.2 | 3212.3
A53 mtune/vectorize, post-patch | 3205.40 | 1.81 | 3 | 3203.2 | 3209
A53 vectorize, updated | 3223.57 | 3.32 | 3 | 3217 | 3227.7
A53 vectorize, earlier build | 3233.77 | 1.49 | 3 | 3230.8 | 3235.5
A57 vectorize/unrolled GCC 7.0.1 | 3398.47 | 1.93 | 3 | 3395.5 | 3402.1
A53 vectorize GCC 7.0.1 | 3415.07 | 2.42 | 3 | 3410.6 | 3418.9
A57 vectorize/unrolled GCC 6.3 | 3325.60 | 1.97 | 3 | 3321.8 | 3328.4

(CC) gcc options: -O3

TTSIOD 3D Renderer

A portable GPL 3D software renderer that supports OpenMP and Intel Threading Building Blocks with many different rendering modes. This version does not use OpenGL but is entirely CPU/software based. Learn more via the OpenBenchmarking.org test page.

TTSIOD 3D Renderer 2.3a, Phong Rendering With Soft-Shadow Mapping (FPS, more is better)

Run | Result | SE +/- | N | Min | Max | Per-run options
A53 vectorize, pre-patch | 23.16 | 0.01 | 3 | 23.14 | 23.16 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
thunderx/vectorize, pre-patch | 23.01 | 0.00 | 3 | 23.01 | 23.01 | -Ofast -mcpu=thunderx -ftree-vectorize
A53 vectorize/LTO, pre patch | 23.77 | 0.01 | 3 | 23.74 | 23.78 | -O3 -mcpu=cortex-a53 -ftree-vectorize -ffat-lto-objects
A53, post patch | 23.47 | 0.02 | 3 | 23.44 | 23.49 | -Ofast -mcpu=cortex-a53
A53 mtune/vectorize, post-patch | 23.49 | 0.01 | 3 | 23.47 | 23.51 | -Ofast -mtune=cortex-a53 -ftree-vectorize
A53 vectorize, updated | 23.29 | 0.09 | 3 | 23.18 | 23.48 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
A53 vectorize, earlier build | 23.49 | 0.00 | 3 | 23.49 | 23.5 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
A57 vectorize/unrolled GCC 7.0.1 | 23.56 | 0.01 | 3 | 23.53 | 23.58 | -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops
A53 vectorize GCC 7.0.1 | 23.13 | 0.02 | 3 | 23.1 | 23.15 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
A57 vectorize/unrolled GCC 6.3 | 22.29 | 0.01 | 3 | 22.28 | 22.3 | -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops

Common to all runs: (CXX) g++ options: -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ffast-math -mtune=native -flto -lSDL -lstdc++

C-Ray

This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image. Learn more via the OpenBenchmarking.org test page.
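To give a sense of the workload size, the description's figures can be multiplied out. This is a small sketch only: the core count of 4 is taken from the system details ("4 Cores"), the function names are illustrative, and the ray-rate estimate assumes the reported total time covers a single full render, which the source does not state.

```python
# Figures from the C-Ray description; core count from the system details.
CORES = 4
THREADS_PER_CORE = 16
WIDTH, HEIGHT = 1600, 1200
RAYS_PER_PIXEL = 8


def total_threads(cores=CORES, threads_per_core=THREADS_PER_CORE):
    """Number of worker threads the test would start on this machine."""
    return cores * threads_per_core


def primary_rays(width=WIDTH, height=HEIGHT, rays_per_pixel=RAYS_PER_PIXEL):
    """Primary (anti-aliasing) rays shot for one image."""
    return width * height * rays_per_pixel


def primary_ray_rate(total_seconds):
    """Rough primary rays per second, assuming one render per reported time."""
    return primary_rays() / total_seconds


print(total_threads())  # 64
print(primary_rays())   # 15360000
print(round(primary_ray_rate(151.02)))  # for a run reporting 151.02 seconds
```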

C-Ray 1.1, Total Time (Seconds, fewer is better)

Run | Result | SE +/- | N | Min | Max | Per-run options
A53 vectorize, pre-patch | 187.97 | 0.69 | 3 | 186.75 | 189.13 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
thunderx/vectorize, pre-patch | 149.82 | 1.37 | 3 | 148.36 | 152.55 | -Ofast -mcpu=thunderx -ftree-vectorize
A53 vectorize/LTO, pre patch | 184.81 | 0.17 | 3 | 184.51 | 185.1 | -mcpu=cortex-a53 -ftree-vectorize -flto -ffat-lto-objects
A53, post patch | 186.69 | 0.14 | 3 | 186.54 | 186.96 | -Ofast -mcpu=cortex-a53
A53 mtune/vectorize, post-patch | 186.61 | 0.12 | 3 | 186.37 | 186.72 | -Ofast -mtune=cortex-a53 -ftree-vectorize
A53 vectorize, updated | 161.80 | 0.27 | 3 | 161.53 | 162.35 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
A53 vectorize, earlier build | 162.23 | 1.00 | 3 | 161.19 | 164.23 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
A57 vectorize/unrolled GCC 7.0.1 | 154.81 | 0.78 | 3 | 153.93 | 156.36 | -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops
A53 vectorize GCC 7.0.1 | 151.02 | 0.02 | 3 | 150.99 | 151.05 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
A57 vectorize/unrolled GCC 6.3 | 149.61 | 1.47 | 3 | 148.13 | 152.54 | -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops

Common to all runs: (CC) gcc options: -lm -lpthread -O3 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc

Primesieve

Primesieve 5.4.2, 1e12 Prime Number Generation (Seconds, fewer is better)

Run | Result | SE +/- | N | Min | Max | Per-run options
A53 vectorize, pre-patch | 543.16 | 3.01 | 3 | 537.24 | 547.08 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
thunderx/vectorize, pre-patch | 566.21 | 2.99 | 3 | 561.7 | 571.87 | -Ofast -mcpu=thunderx -ftree-vectorize
A53 vectorize/LTO, pre patch | 540.95 | 8.42 | 3 | 532.14 | 557.79 | -O3 -mcpu=cortex-a53 -ftree-vectorize -flto -ffat-lto-objects
A53, post patch | 553.13 | 9.14 | 3 | 536.7 | 568.29 | -Ofast -mcpu=cortex-a53
A53 mtune/vectorize, post-patch | 573.13 | 6.92 | 3 | 560.03 | 583.51 | -Ofast -mtune=cortex-a53 -ftree-vectorize
A53 vectorize, updated | 574.65 | 9.13 | 4 | 557.95 | 597.03 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
A53 vectorize, earlier build | 523.43 | 4.17 | 3 | 516.73 | 531.08 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
A57 vectorize/unrolled GCC 7.0.1 | 525.00 | 3.16 | 3 | 521.42 | 531.29 | -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops
A53 vectorize GCC 7.0.1 | 547.12 | 9.38 | 3 | 535.52 | 565.68 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
A57 vectorize/unrolled GCC 6.3 | 531.95 | 1.80 | 3 | 528.37 | 534.08 | -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops

Common to all runs: (CXX) g++ options: -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -fopenmp

Smallpt

Smallpt 1.0, Global Illumination Renderer; 100 Samples (Seconds, fewer is better)

Run | Result | Per-run options
A53 vectorize, pre-patch | 167 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
thunderx/vectorize, pre-patch | 167 | -Ofast -mcpu=thunderx -ftree-vectorize
A53 vectorize/LTO, pre patch | 168 | -O3 -mcpu=cortex-a53 -ftree-vectorize -flto -ffat-lto-objects
A53, post patch | 168 | -Ofast -mcpu=cortex-a53
A53 mtune/vectorize, post-patch | 167 | -Ofast -mtune=cortex-a53 -ftree-vectorize
A53 vectorize, updated | 166 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
A53 vectorize, earlier build | 166 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
A57 vectorize/unrolled GCC 7.0.1 | 168 | -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops
A53 vectorize GCC 7.0.1 | 166 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
A57 vectorize/unrolled GCC 6.3 | 169 | -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops

The source lists SE and Min / Avg / Max for only eight of the ten runs and does not say which two are omitted. In source order:

SE +/- | N | Min | Avg | Max
0.00 | 3 | 167 | 167 | 167
0.00 | 3 | 167 | 167 | 167
0.00 | 3 | 168 | 168 | 168
0.00 | 3 | 168 | 168 | 168
0.00 | 3 | 167 | 167 | 167
0.33 | 3 | 166 | 166.33 | 167
0.33 | 3 | 166 | 166.33 | 167
2.33 | 3 | 167 | 169.33 | 174

Common to all runs: (CXX) g++ options: -fopenmp -fomit-frame-pointer -fipa-pta -march=armv8-a+crc

Sudokut

This is a test of Sudokut, which is a Sudoku puzzle solver written in Tcl. This test measures how long it takes to solve 100 Sudoku puzzles. Learn more via the OpenBenchmarking.org test page.

Sudokut 0.4, Total Time (Seconds, fewer is better)

Run | Result | SE +/- | N | Min | Max
A53 vectorize, pre-patch | 101.95 | 0.20 | 3 | 101.73 | 102.35
thunderx/vectorize, pre-patch | 102.75 | 0.76 | 3 | 101.98 | 104.26
A53 vectorize/LTO, pre patch | 101.75 | 0.21 | 3 | 101.51 | 102.17
A53, post patch | 101.88 | 0.10 | 3 | 101.7 | 102.05
A53 mtune/vectorize, post-patch | 102.17 | 0.09 | 3 | 102.06 | 102.35
A53 vectorize, updated | 102.72 | 0.20 | 3 | 102.32 | 102.92
A53 vectorize, earlier build | 102.59 | 0.06 | 3 | 102.49 | 102.7
A57 vectorize/unrolled GCC 7.0.1 | 103.04 | 0.04 | 3 | 102.97 | 103.09
A53 vectorize GCC 7.0.1 | 102.97 | 0.13 | 3 | 102.71 | 103.12
A57 vectorize/unrolled GCC 6.3 | 103.00 | 0.04 | 3 | 102.93 | 103.07

Tachyon

This is a test of the threaded Tachyon, a parallel ray-tracing system. Learn more via the OpenBenchmarking.org test page.

Tachyon 0.98.9, Total Time (Seconds, fewer is better)

Run | Result | SE +/- | N | Min | Max
A53 vectorize, pre-patch | 69.27 | 0.08 | 3 | 69.18 | 69.42
thunderx/vectorize, pre-patch | 71.41 | 0.06 | 3 | 71.32 | 71.52
A53 vectorize/LTO, pre patch | 67.64 | 0.11 | 3 | 67.52 | 67.86
A53, post patch | 69.40 | 0.12 | 3 | 69.2 | 69.61
A53 mtune/vectorize, post-patch | 69.34 | 0.10 | 3 | 69.19 | 69.54
A53 vectorize, updated | 69.90 | 0.29 | 3 | 69.38 | 70.4
A53 vectorize, earlier build | 69.39 | 0.22 | 3 | 69.04 | 69.81
A53 vectorize GCC 7.0.1 | 69.64 | 0.17 | 3 | 69.36 | 69.96
A57 vectorize/unrolled GCC 6.3 | 72.28 | 0.03 | 3 | 72.23 | 72.32

The source lists no Tachyon result for the run A57 vectorize/unrolled GCC 7.0.1.

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test measures the RSA 4096-bit performance of OpenSSL. Learn more via the OpenBenchmarking.org test page.

OpenSSL 1.0.1g, RSA 4096-bit Performance (Signs Per Second, more is better)

Run | Result | SE +/- | N | Min | Max
A53 vectorize, pre-patch | 21.50 | 0.00 | 3 | 21.5 | 21.5
thunderx/vectorize, pre-patch | 21.50 | 0.00 | 3 | 21.5 | 21.5
A53 vectorize/LTO, pre patch | 21.50 | 0.00 | 3 | 21.5 | 21.5
A53, post patch | 21.50 | 0.00 | 3 | 21.5 | 21.5
A53 mtune/vectorize, post-patch | 21.50 | 0.00 | 3 | 21.5 | 21.5
A53 vectorize, updated | 21.40 | 0.00 | 3 | 21.4 | 21.4
A53 vectorize, earlier build | 21.40 | 0.00 | 3 | 21.4 | 21.4
A57 vectorize/unrolled GCC 7.0.1 | 21.47 | 0.03 | 3 | 21.4 | 21.5
A53 vectorize GCC 7.0.1 | 21.47 | 0.03 | 3 | 21.4 | 21.5
A57 vectorize/unrolled GCC 6.3 | 21.20 | 0.00 | 3 | 21.2 | 21.2

(CC) gcc options: -O3 -fomit-frame-pointer -lssl -lcrypto -ldl

Redis

Redis is an open-source data structure server. Learn more via the OpenBenchmarking.org test page.

Redis 3.0.1, Test: GET (Requests Per Second, more is better)

Run | Result | SE +/- | N | Min | Max | Per-run options
A53 vectorize, pre-patch | 310344.73 | 4662.92 | 6 | 289435.59 | 322268.75 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
thunderx/vectorize, pre-patch | 318926.02 | 2784.59 | 3 | 314465.41 | 324044.06 | -Ofast -mcpu=thunderx -ftree-vectorize
A53 vectorize/LTO, pre patch | 311785.02 | 2239.53 | 3 | 309310.22 | 316255.53 | -O3 -mcpu=cortex-a53 -ftree-vectorize -flto -ffat-lto-objects
A53, post patch | 309030.64 | 1052.91 | 3 | 307314.06 | 310945.28 | -Ofast -mcpu=cortex-a53
A53 mtune/vectorize, post-patch | 313438.91 | 1967.34 | 3 | 309693.41 | 316355.56 | -Ofast -mtune=cortex-a53 -ftree-vectorize
A53 vectorize, updated | 277268.23 | 2017.17 | 3 | 273298.72 | 279876.84 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
A53 vectorize, earlier build | 283742.83 | 419.32 | 3 | 283045.56 | 284495 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
A57 vectorize/unrolled GCC 7.0.1 | 276169.86 | 649.43 | 3 | 275178.88 | 277392.5 | -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops
A53 vectorize GCC 7.0.1 | 276298.44 | 2031.32 | 3 | 272628.12 | 279642.03 | -Ofast -mcpu=cortex-a53 -ftree-vectorize
A57 vectorize/unrolled GCC 6.3 | 275458.70 | 3267.58 | 3 | 269541.75 | 280820 | -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops

Common to all runs: (CC) gcc options: -ggdb -rdynamic -lm -pthread -ldl -O2 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc