AARCH64 codegen comparison update

gcc7's performance on Cortex A53 (32kB L1)

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 1703238-RI-GCCLATEST31

Result Identifier                   Date Run
A53 vectorize, pre-patch            January 11 2017
thunderx/vectorize, pre-patch       January 10 2017
A53 vectorize/LTO, pre patch        January 11 2017
A53, post patch                     January 14 2017
A53 mtune/vectorize, post-patch     January 14 2017
A53 vectorize, updated              February 15 2017
A53 vectorize, earlier build        February 25 2017
A57 vectorize/unrolled GCC 7.0.1    March 22 2017
A53 vectorize GCC 7.0.1             March 23 2017
A57 vectorize/unrolled GCC 6.3      March 22 2017

HTML result view exported from: https://openbenchmarking.org/result/1703238-RI-GCCLATEST31&rdt&grs&export=txt.

System Details

Values are listed in the order they appear in the export; the flattened table does not preserve which run each value belongs to.

Processor:          AArch64 rev 4 @ 1.50GHz (4 Cores); AArch64 rev 4 @ 1.55GHz (4 Cores); Unknown @ 1.54GHz (4 Cores)
Motherboard:        Amlogic
Memory:             2048MB
Disk:               32GB 00000 + 16GB NCard; 16GB NCard + 32GB 00000; 8GB NCard + 32GB 00000
OS:                 Ubuntu 16.04
Kernel:             3.14.29 (aarch64); 3.14.79-vegas95 (aarch64)
Compiler:           GCC 7.0.0 20170110; GCC 7.0.0 20170113; GCC 7.0.1 20170214; GCC 7.0.1 20170220; GCC 6.3.1 20170316; GCC 7.0.1 20170322 (each + Clang 3.8.0-2ubuntu4 + LLVM 3.8.0)
File-System:        ext4
Screen Resolution:  1920x3240; 1280x1440

Compiler Details

- thunderx/vectorize, pre-patch; A53 vectorize, pre-patch; A53 vectorize/LTO, pre patch; A53, post patch; A53 mtune/vectorize, post-patch; A53 vectorize, updated; A53 vectorize, earlier build:
  --build=aarch64-linux-gnu --disable-browser-plugin --disable-libquadmath --disable-werror --enable-checking=release --enable-clocale=gnu --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,c++,fortran --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --target=aarch64-linux-gnu --with-arch-directory=aarch64 --with-default-libstdcxx-abi=new
- A57 vectorize/unrolled GCC 6.3; A57 vectorize/unrolled GCC 7.0.1; A53 vectorize GCC 7.0.1:
  --build=aarch64-linux-gnu --disable-bootstrap --disable-browser-plugin --disable-libquadmath --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-languages=c,c++,fortran --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --target=aarch64-linux-gnu --with-arch-directory=aarch64 --with-default-libstdcxx-abi=new

Disk Details

Run                               Scheduler  Mount options
thunderx/vectorize, pre-patch     CFQ        commit=30,errors=remount-ro,noatime,nodiratime,rw
A53 vectorize, pre-patch          CFQ        commit=30,errors=remount-ro,noatime,nodiratime,rw
A53 vectorize/LTO, pre patch      CFQ        commit=30,errors=remount-ro,noatime,nodiratime,rw
A53, post patch                   CFQ        commit=30,errors=remount-ro,noatime,nodiratime,rw
A53 mtune/vectorize, post-patch   CFQ        commit=30,errors=remount-ro,noatime,nodiratime,rw
A53 vectorize, updated            DEADLINE   commit=45,errors=remount-ro,noatime,nodiratime,rw
A53 vectorize, earlier build      CFQ        commit=45,errors=remount-ro,noatime,nodiratime,rw
A57 vectorize/unrolled GCC 6.3    CFQ        commit=120,errors=remount-ro,noatime,nodiratime,rw
A57 vectorize/unrolled GCC 7.0.1  CFQ        commit=120,errors=remount-ro,noatime,nodiratime,rw
A53 vectorize GCC 7.0.1           CFQ        commit=120,errors=remount-ro,noatime,nodiratime,rw

Processor Details

Run                               Scaling Governor
thunderx/vectorize, pre-patch     meson_cpufreq performance
A53 vectorize, pre-patch          meson_cpufreq performance
A53 vectorize/LTO, pre patch      meson_cpufreq performance
A53, post patch                   meson_cpufreq performance
A53 mtune/vectorize, post-patch   meson_cpufreq interactive
A53 vectorize, updated            meson_cpufreq performance
A53 vectorize, earlier build      meson_cpufreq performance
A57 vectorize/unrolled GCC 6.3    meson_cpufreq performance
A57 vectorize/unrolled GCC 7.0.1  meson_cpufreq performance
A53 vectorize GCC 7.0.1           meson_cpufreq performance

Results Overview

Run columns:
[1] thunderx/vectorize, pre-patch     [6]  A53 vectorize, updated
[2] A53 vectorize, pre-patch          [7]  A53 vectorize, earlier build
[3] A53 vectorize/LTO, pre patch      [8]  A57 vectorize/unrolled GCC 6.3
[4] A53, post patch                   [9]  A57 vectorize/unrolled GCC 7.0.1
[5] A53 mtune/vectorize, post-patch   [10] A53 vectorize GCC 7.0.1

Test                     [1]        [2]        [3]        [4]        [5]        [6]        [7]        [8]        [9]       [10]
ramspeed Copy FP     2817.45    4580.39    4825.13    4964.66    4965.60    4785.59    4816.71    4388.73    4193.97    4188.18
ramspeed Copy Int    2821.43    4581.32    4829.91    4965.06    4955.97    4706.40    4816.69    4384.53    4201.53    4161.09
c-ray                 149.82     187.97     184.81     186.69     186.61     161.80     162.23     149.61     154.81     151.02
fftw                  190.63     196.90     180.53     186.21     184.81     185.15     191.54     157.97     173.03     156.72
postmark                1351       1363       1378       1381       1378       1217       1211       1190       1184       1194
redis GET          318926.02  310344.73  311785.02  309030.64  313438.91  277268.23  283742.83  275458.70  276169.86  276298.44
mafft                  34.46      35.42      33.16      33.06      32.17      33.90      34.52      35.47      34.22      35.52
primesieve            566.21     543.16     540.95     553.13     573.13     574.65     523.43     531.95     525.00     547.12
tachyon                71.41      69.27      67.64      69.40      69.34      69.90      69.39      72.28        n/a      69.64
ttsiod-renderer        23.01      23.16      23.77      23.47      23.49      23.29      23.49      22.29      23.56      23.13
fhourstones          3210.20    3212.10    3213.77    3209.67    3205.40    3223.57    3233.77    3325.60    3398.47    3415.07
smallpt                  167        167        168        168        167        166        166        169        168        166
openssl                21.50      21.50      21.50      21.50      21.50      21.40      21.40      21.20      21.47      21.47
sudokut               102.75     101.95     101.75     101.88     102.17     102.72     102.59     103.00     103.04     102.97
gmpbench              554.83     552.84     554.37     552.56     555.10     554.11     554.21     554.03     553.17     552.75

Full test titles: ramspeed: Copy - Floating Point; ramspeed: Copy - Integer; c-ray: Total Time; fftw: Stock - 2D FFT Size 2048; postmark: Disk Transaction Performance; redis: GET; mafft: Multiple Sequence Alignment; primesieve: 1e12 Prime Number Generation; tachyon: Total Time; ttsiod-renderer: Phong Rendering With Soft-Shadow Mapping; fhourstones: Complex Connect-4 Solving; smallpt: Global Illumination Renderer; 100 Samples; openssl: RSA 4096-bit Performance; sudokut: Total Time; gmpbench: Total Time.

RAMspeed SMP

Type: Copy - Benchmark: Floating Point

RAMspeed SMP 3.5.0, Type: Copy - Benchmark: Floating Point (MB/s, More Is Better)

Run                                  MB/s
thunderx/vectorize, pre-patch     2817.45
A53 vectorize, pre-patch          4580.39
A53 vectorize/LTO, pre patch      4825.13
A53, post patch                   4964.66
A53 mtune/vectorize, post-patch   4965.60
A53 vectorize, updated            4785.59
A53 vectorize, earlier build      4816.71
A57 vectorize/unrolled GCC 6.3    4388.73
A57 vectorize/unrolled GCC 7.0.1  4193.97
A53 vectorize GCC 7.0.1           4188.18

RAMspeed SMP

Type: Copy - Benchmark: Integer

RAMspeed SMP 3.5.0, Type: Copy - Benchmark: Integer (MB/s, More Is Better)

Run                                  MB/s
thunderx/vectorize, pre-patch     2821.43
A53 vectorize, pre-patch          4581.32
A53 vectorize/LTO, pre patch      4829.91
A53, post patch                   4965.06
A53 mtune/vectorize, post-patch   4955.97
A53 vectorize, updated            4706.40
A53 vectorize, earlier build      4816.69
A57 vectorize/unrolled GCC 6.3    4384.53
A57 vectorize/unrolled GCC 7.0.1  4201.53
A53 vectorize GCC 7.0.1           4161.09

C-Ray

Total Time

C-Ray 1.1, Total Time (Seconds, Fewer Is Better)

Run                               Seconds  SE +/-  N  Per-run flags
thunderx/vectorize, pre-patch      149.82    1.37  3  -Ofast -mcpu=thunderx -ftree-vectorize
A53 vectorize, pre-patch           187.97    0.69  3  -Ofast -mcpu=cortex-a53 -ftree-vectorize
A53 vectorize/LTO, pre patch       184.81    0.17  3  -mcpu=cortex-a53 -ftree-vectorize -flto -ffat-lto-objects
A53, post patch                    186.69    0.14  3  -Ofast -mcpu=cortex-a53
A53 mtune/vectorize, post-patch    186.61    0.12  3  -Ofast -mtune=cortex-a53 -ftree-vectorize
A53 vectorize, updated             161.80    0.27  3  -Ofast -mcpu=cortex-a53 -ftree-vectorize
A53 vectorize, earlier build       162.23    1.00  3  -Ofast -mcpu=cortex-a53 -ftree-vectorize
A57 vectorize/unrolled GCC 6.3     149.61    1.47  3  -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops
A57 vectorize/unrolled GCC 7.0.1   154.81    0.78  3  -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops
A53 vectorize GCC 7.0.1            151.02    0.02  3  -Ofast -mcpu=cortex-a53 -ftree-vectorize

1. (CC) gcc options: -lm -lpthread -O3 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
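For a "fewer is better" metric such as C-Ray's total time, a run's change relative to a baseline can be expressed as a signed percentage, where a negative number means the run finished faster. The sketch below is illustrative only: the helper name `percent_change` and the choice of "A53 vectorize, pre-patch" as baseline are assumptions, and the three timings are copied from the C-Ray results above.

```python
def percent_change(baseline, candidate):
    """Signed change of candidate relative to baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


# C-Ray total time in seconds (fewer is better), copied from the results.
c_ray_seconds = {
    "A53 vectorize, pre-patch": 187.97,
    "A53 vectorize, updated": 161.80,
    "A53 vectorize GCC 7.0.1": 151.02,
}

# Baseline choice is an assumption made for this illustration.
baseline = c_ray_seconds["A53 vectorize, pre-patch"]

for run, seconds in c_ray_seconds.items():
    # Negative values mean less time than the baseline, i.e. faster.
    print(f"{run}: {percent_change(baseline, seconds):+.1f}%")
```

For "more is better" metrics (MB/s, Mflops, requests per second) the same helper applies, but a positive value is then the improvement.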

FFTW

Build: Stock - Size: 2D FFT Size 2048

FFTW 3.3.4, Build: Stock - Size: 2D FFT Size 2048 (Mflops, More Is Better)

Run                                Mflops  SE +/-  N  Per-run flags
thunderx/vectorize, pre-patch      190.63    1.10  5  -Ofast -mcpu=thunderx -ftree-vectorize
A53 vectorize, pre-patch           196.90    0.99  5  -Ofast -mcpu=cortex-a53 -ftree-vectorize
A53 vectorize/LTO, pre patch       180.53    0.49  5  -O3 -mcpu=cortex-a53 -ftree-vectorize -flto -ffat-lto-objects
A53, post patch                    186.21    0.08  5  -Ofast -mcpu=cortex-a53
A53 mtune/vectorize, post-patch    184.81    0.21  5  -Ofast -mtune=cortex-a53 -ftree-vectorize
A53 vectorize, updated             185.15    0.06  5  -Ofast -mcpu=cortex-a53 -ftree-vectorize
A53 vectorize, earlier build       191.54    0.16  5  -Ofast -mcpu=cortex-a53 -ftree-vectorize
A57 vectorize/unrolled GCC 6.3     157.97    0.26  5  -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops
A57 vectorize/unrolled GCC 7.0.1   173.03    0.10  5  -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops
A53 vectorize GCC 7.0.1            156.72    0.16  5  -Ofast -mcpu=cortex-a53 -ftree-vectorize

1. (CC) gcc options: -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -lm

PostMark

Disk Transaction Performance

PostMark 1.51, Disk Transaction Performance (TPS, More Is Better)

Run                                TPS
thunderx/vectorize, pre-patch     1351
A53 vectorize, pre-patch          1363
A53 vectorize/LTO, pre patch      1378
A53, post patch                   1381
A53 mtune/vectorize, post-patch   1378
A53 vectorize, updated            1217
A53 vectorize, earlier build      1211
A57 vectorize/unrolled GCC 6.3    1190
A57 vectorize/unrolled GCC 7.0.1  1184
A53 vectorize GCC 7.0.1           1194

SE +/- values (N = 3 each; only eight are listed for the ten runs, so they are not attributed to individual runs): 0.00, 2.67, 2.67, 4.33, 5.00, 2.00, 5.29, 3.67

1. (CC) gcc options: -O3

Redis

Test: GET

Redis 3.0.1, Test: GET (Requests Per Second, More Is Better)

Run                               Requests/s   SE +/-  N  Per-run flags
thunderx/vectorize, pre-patch      318926.02  2784.59  3  -Ofast -mcpu=thunderx -ftree-vectorize
A53 vectorize, pre-patch           310344.73  4662.92  6  -Ofast -mcpu=cortex-a53 -ftree-vectorize
A53 vectorize/LTO, pre patch       311785.02  2239.53  3  -O3 -mcpu=cortex-a53 -ftree-vectorize -flto -ffat-lto-objects
A53, post patch                    309030.64  1052.91  3  -Ofast -mcpu=cortex-a53
A53 mtune/vectorize, post-patch    313438.91  1967.34  3  -Ofast -mtune=cortex-a53 -ftree-vectorize
A53 vectorize, updated             277268.23  2017.17  3  -Ofast -mcpu=cortex-a53 -ftree-vectorize
A53 vectorize, earlier build       283742.83   419.32  3  -Ofast -mcpu=cortex-a53 -ftree-vectorize
A57 vectorize/unrolled GCC 6.3     275458.70  3267.58  3  -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops
A57 vectorize/unrolled GCC 7.0.1   276169.86   649.43  3  -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops
A53 vectorize GCC 7.0.1            276298.44  2031.32  3  -Ofast -mcpu=cortex-a53 -ftree-vectorize

1. (CC) gcc options: -ggdb -rdynamic -lm -pthread -ldl -O2 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc
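The "SE +/-" and "N" figures give a rough way to judge whether two Redis results differ by more than run-to-run noise. The sketch below assumes that SE is the standard error of the mean (SE = s / sqrt(N)) and that the SE values are listed in the same order as the runs; the function names and the two-SE band are illustrative choices, not part of the Phoronix Test Suite. With N as small as 3 this is a coarse screen, not a significance test.

```python
import math


def sample_std_from_se(se, n):
    """Recover the sample standard deviation s from SE = s / sqrt(n)."""
    return se * math.sqrt(n)


def intervals_overlap(mean_a, se_a, mean_b, se_b, k=2.0):
    """True if the mean +/- k*SE bands of two runs overlap."""
    lo_a, hi_a = mean_a - k * se_a, mean_a + k * se_a
    lo_b, hi_b = mean_b - k * se_b, mean_b + k * se_b
    return lo_a <= hi_b and lo_b <= hi_a


# Redis GET, requests per second: (mean, SE, N), copied from the results.
redis_get = {
    "thunderx/vectorize, pre-patch": (318926.02, 2784.59, 3),
    "A53 vectorize, pre-patch": (310344.73, 4662.92, 6),
    "A53 vectorize GCC 7.0.1": (276298.44, 2031.32, 3),
}

ref_name = "thunderx/vectorize, pre-patch"
ref_mean, ref_se, _ = redis_get[ref_name]

for name, (mean, se, n) in redis_get.items():
    if name == ref_name:
        continue
    overlap = intervals_overlap(ref_mean, ref_se, mean, se)
    verdict = "within noise of" if overlap else "distinct from"
    print(f"{name}: {verdict} {ref_name} "
          f"(sample std dev ~ {sample_std_from_se(se, n):.0f} req/s)")
```

Under these assumptions the two pre-patch runs fall within each other's bands, while the GCC 7.0.1 run does not.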

Timed MAFFT Alignment

Multiple Sequence Alignment

Timed MAFFT Alignment 6.864, Multiple Sequence Alignment (Seconds, Fewer Is Better)

Run                               Seconds  SE +/-  N
thunderx/vectorize, pre-patch       34.46    0.73  6
A53 vectorize, pre-patch            35.42    0.80  6
A53 vectorize/LTO, pre patch        33.16    0.70  6
A53, post patch                     33.06    0.71  6
A53 mtune/vectorize, post-patch     32.17    0.01  3
A53 vectorize, updated              33.90    0.79  6
A53 vectorize, earlier build        34.52    1.04  6
A57 vectorize/unrolled GCC 6.3      35.47    0.08  3
A57 vectorize/unrolled GCC 7.0.1    34.22    0.71  6
A53 vectorize GCC 7.0.1             35.52    0.97  6

1. (CC) gcc options: -O3 -lm -lpthread

Primesieve

1e12 Prime Number Generation

Primesieve 5.4.2, 1e12 Prime Number Generation (Seconds, Fewer Is Better)

Run                               Seconds  SE +/-  N  Per-run flags
thunderx/vectorize, pre-patch      566.21    2.99  3  -Ofast -mcpu=thunderx -ftree-vectorize
A53 vectorize, pre-patch           543.16    3.01  3  -Ofast -mcpu=cortex-a53 -ftree-vectorize
A53 vectorize/LTO, pre patch       540.95    8.42  3  -O3 -mcpu=cortex-a53 -ftree-vectorize -flto -ffat-lto-objects
A53, post patch                    553.13    9.14  3  -Ofast -mcpu=cortex-a53
A53 mtune/vectorize, post-patch    573.13    6.92  3  -Ofast -mtune=cortex-a53 -ftree-vectorize
A53 vectorize, updated             574.65    9.13  4  -Ofast -mcpu=cortex-a53 -ftree-vectorize
A53 vectorize, earlier build       523.43    4.17  3  -Ofast -mcpu=cortex-a53 -ftree-vectorize
A57 vectorize/unrolled GCC 6.3     531.95    1.80  3  -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops
A57 vectorize/unrolled GCC 7.0.1   525.00    3.16  3  -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops
A53 vectorize GCC 7.0.1            547.12    9.38  3  -Ofast -mcpu=cortex-a53 -ftree-vectorize

1. (CXX) g++ options: -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -fopenmp

Tachyon

Total Time

Tachyon 0.98.9, Total Time (Seconds, Fewer Is Better)

Run                               Seconds  SE +/-  N
thunderx/vectorize, pre-patch       71.41    0.06  3
A53 vectorize, pre-patch            69.27    0.08  3
A53 vectorize/LTO, pre patch        67.64    0.11  3
A53, post patch                     69.40    0.12  3
A53 mtune/vectorize, post-patch     69.34    0.10  3
A53 vectorize, updated              69.90    0.29  3
A53 vectorize, earlier build        69.39    0.22  3
A57 vectorize/unrolled GCC 6.3      72.28    0.03  3
A53 vectorize GCC 7.0.1             69.64    0.17  3

No result is listed for A57 vectorize/unrolled GCC 7.0.1.

TTSIOD 3D Renderer

Phong Rendering With Soft-Shadow Mapping

TTSIOD 3D Renderer 2.3a, Phong Rendering With Soft-Shadow Mapping (FPS, More Is Better)

Run                                 FPS  SE +/-  N  Per-run flags
thunderx/vectorize, pre-patch     23.01    0.00  3  -Ofast -mcpu=thunderx -ftree-vectorize
A53 vectorize, pre-patch          23.16    0.01  3  -Ofast -mcpu=cortex-a53 -ftree-vectorize
A53 vectorize/LTO, pre patch      23.77    0.01  3  -O3 -mcpu=cortex-a53 -ftree-vectorize -ffat-lto-objects
A53, post patch                   23.47    0.02  3  -Ofast -mcpu=cortex-a53
A53 mtune/vectorize, post-patch   23.49    0.01  3  -Ofast -mtune=cortex-a53 -ftree-vectorize
A53 vectorize, updated            23.29    0.09  3  -Ofast -mcpu=cortex-a53 -ftree-vectorize
A53 vectorize, earlier build      23.49    0.00  3  -Ofast -mcpu=cortex-a53 -ftree-vectorize
A57 vectorize/unrolled GCC 6.3    22.29    0.01  3  -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops
A57 vectorize/unrolled GCC 7.0.1  23.56    0.01  3  -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops
A53 vectorize GCC 7.0.1           23.13    0.02  3  -Ofast -mcpu=cortex-a53 -ftree-vectorize

1. (CXX) g++ options: -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ffast-math -mtune=native -flto -lSDL -lstdc++

Fhourstones

Complex Connect-4 Solving

Fhourstones 3.1, Complex Connect-4 Solving (Kpos / sec, More Is Better)

Run                               Kpos / sec  SE +/-  N
thunderx/vectorize, pre-patch        3210.20    0.76  3
A53 vectorize, pre-patch             3212.10    0.35  3
A53 vectorize/LTO, pre patch         3213.77    0.22  3
A53, post patch                      3209.67    1.47  3
A53 mtune/vectorize, post-patch      3205.40    1.81  3
A53 vectorize, updated               3223.57    3.32  3
A53 vectorize, earlier build         3233.77    1.49  3
A57 vectorize/unrolled GCC 6.3       3325.60    1.97  3
A57 vectorize/unrolled GCC 7.0.1     3398.47    1.93  3
A53 vectorize GCC 7.0.1              3415.07    2.42  3

1. (CC) gcc options: -O3

Smallpt

Global Illumination Renderer; 100 Samples

Smallpt 1.0, Global Illumination Renderer; 100 Samples (Seconds, Fewer Is Better)

Run                               Seconds  Per-run flags
thunderx/vectorize, pre-patch         167  -Ofast -mcpu=thunderx -ftree-vectorize
A53 vectorize, pre-patch              167  -Ofast -mcpu=cortex-a53 -ftree-vectorize
A53 vectorize/LTO, pre patch          168  -O3 -mcpu=cortex-a53 -ftree-vectorize -flto -ffat-lto-objects
A53, post patch                       168  -Ofast -mcpu=cortex-a53
A53 mtune/vectorize, post-patch       167  -Ofast -mtune=cortex-a53 -ftree-vectorize
A53 vectorize, updated                166  -Ofast -mcpu=cortex-a53 -ftree-vectorize
A53 vectorize, earlier build          166  -Ofast -mcpu=cortex-a53 -ftree-vectorize
A57 vectorize/unrolled GCC 6.3        169  -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops
A57 vectorize/unrolled GCC 7.0.1      168  -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops
A53 vectorize GCC 7.0.1               166  -Ofast -mcpu=cortex-a53 -ftree-vectorize

SE +/- values (N = 3 each; only eight are listed for the ten runs, so they are not attributed to individual runs): 0.00, 0.00, 0.00, 0.00, 0.00, 0.33, 0.33, 2.33

1. (CXX) g++ options: -fopenmp -fomit-frame-pointer -fipa-pta -march=armv8-a+crc

OpenSSL

RSA 4096-bit Performance

OpenSSL 1.0.1g, RSA 4096-bit Performance (Signs Per Second, More Is Better)

Run                               Signs/s  SE +/-  N
thunderx/vectorize, pre-patch       21.50    0.00  3
A53 vectorize, pre-patch            21.50    0.00  3
A53 vectorize/LTO, pre patch        21.50    0.00  3
A53, post patch                     21.50    0.00  3
A53 mtune/vectorize, post-patch     21.50    0.00  3
A53 vectorize, updated              21.40    0.00  3
A53 vectorize, earlier build        21.40    0.00  3
A57 vectorize/unrolled GCC 6.3      21.20    0.00  3
A57 vectorize/unrolled GCC 7.0.1    21.47    0.03  3
A53 vectorize GCC 7.0.1             21.47    0.03  3

1. (CC) gcc options: -O3 -fomit-frame-pointer -lssl -lcrypto -ldl

Sudokut

Total Time

Sudokut 0.4, Total Time (Seconds, Fewer Is Better)

Run                               Seconds  SE +/-  N
thunderx/vectorize, pre-patch      102.75    0.76  3
A53 vectorize, pre-patch           101.95    0.20  3
A53 vectorize/LTO, pre patch       101.75    0.21  3
A53, post patch                    101.88    0.10  3
A53 mtune/vectorize, post-patch    102.17    0.09  3
A53 vectorize, updated             102.72    0.20  3
A53 vectorize, earlier build       102.59    0.06  3
A57 vectorize/unrolled GCC 6.3     103.00    0.04  3
A57 vectorize/unrolled GCC 7.0.1   103.04    0.04  3
A53 vectorize GCC 7.0.1            102.97    0.13  3

GMPbench

Total Time

GMPbench 0.2, Total Time (GMPbench Score, More Is Better)

Run                                Score  Per-run flags
thunderx/vectorize, pre-patch     554.83  -Ofast -mcpu=thunderx -ftree-vectorize
A53 vectorize, pre-patch          552.84  -Ofast -mcpu=cortex-a53 -ftree-vectorize
A53 vectorize/LTO, pre patch      554.37  -O3 -mcpu=cortex-a53 -ftree-vectorize -flto -ffat-lto-objects
A53, post patch                   552.56  -Ofast -mcpu=cortex-a53
A53 mtune/vectorize, post-patch   555.10  -Ofast -mtune=cortex-a53 -ftree-vectorize
A53 vectorize, updated            554.11  -Ofast -mcpu=cortex-a53 -ftree-vectorize
A53 vectorize, earlier build      554.21  -Ofast -mcpu=cortex-a53 -ftree-vectorize
A57 vectorize/unrolled GCC 6.3    554.03  -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops
A57 vectorize/unrolled GCC 7.0.1  553.17  -Ofast -mtune=cortex-a57 -ftree-vectorize -funroll-loops
A53 vectorize GCC 7.0.1           552.75  -Ofast -mcpu=cortex-a53 -ftree-vectorize

1. (CC) gcc options: -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -lm


Phoronix Test Suite v10.8.4