Cortex A53 GCC7 codegen comparison

Benchmarking the effect of d8c4c75 ARM patch

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 1701143-TA-GCCCOMPAR66

Result Identifier                  Date
A53 vectorize, pre-patch           January 11 2017
thunderx/vectorize, pre-patch      January 10 2017
A53 vectorize/LTO, pre patch       January 11 2017
A53, post patch                    January 14 2017
A53 mtune/vectorize, post-patch    January 14 2017

Cortex A53 GCC7 codegen comparison: system details

Runs: A53 vectorize, pre-patch; thunderx/vectorize, pre-patch; A53 vectorize/LTO, pre patch; A53, post patch; A53 mtune/vectorize, post-patch

Processor:          AArch64 rev 4 @ 1.50GHz (4 Cores)
Motherboard:        Amlogic
Memory:             2048MB
Disk:               32GB 00000 + 16GB NCard
OS:                 Ubuntu 16.04
Kernel:             3.14.29 (aarch64)
Compiler:           GCC 7.0.0 20170110 + Clang 3.8.0-2ubuntu4 + LLVM 3.8.0
File-System:        ext4
Screen Resolution:  1920x3240

Also listed for one or more of the later runs:
Processor:          AArch64 rev 4 @ 1.55GHz (4 Cores)
Compiler:           GCC 7.0.0 20170113 + Clang 3.8.0-2ubuntu4 + LLVM 3.8.0

Compiler Details
--build=aarch64-linux-gnu --disable-browser-plugin --disable-libquadmath --disable-werror --enable-checking=release --enable-clocale=gnu --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,c++,fortran --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --target=aarch64-linux-gnu --with-arch-directory=aarch64 --with-default-libstdcxx-abi=new

Disk Details
CFQ / commit=30,errors=remount-ro,noatime,nodiratime,rw

Processor Details
A53 vectorize, pre-patch:          Scaling Governor: meson_cpufreq performance
thunderx/vectorize, pre-patch:     Scaling Governor: meson_cpufreq performance
A53 vectorize/LTO, pre patch:      Scaling Governor: meson_cpufreq performance
A53, post patch:                   Scaling Governor: meson_cpufreq performance
A53 mtune/vectorize, post-patch:   Scaling Governor: meson_cpufreq interactive

Result Overview (Phoronix Test Suite): relative performance of the five runs per test, on an axis from 100% to 157%. Tests shown: RAMspeed SMP, C-Ray, Timed MAFFT Alignment, FFTW, Primesieve, Tachyon, TTSIOD 3D Renderer, Redis, PostMark, Sudokut, Smallpt, GMPbench, Fhourstones, OpenSSL.

Cortex A53 GCC7 codegen comparison: results summary

Runs: (1) A53 vectorize, pre-patch; (2) thunderx/vectorize, pre-patch; (3) A53 vectorize/LTO, pre patch; (4) A53, post patch; (5) A53 mtune/vectorize, post-patch

Test                                                       (1)         (2)         (3)         (4)         (5)
postmark: Disk Transaction Performance                     1363        1351        1378        1381        1378
ramspeed: Copy - Integer                                   4581.32     2821.43     4829.91     4965.06     4955.97
ramspeed: Copy - Floating Point                            4580.39     2817.45     4825.13     4964.66     4965.60
fftw: Stock - 2D FFT Size 2048                             196.90      190.63      180.53      186.21      184.81
mafft: Multiple Sequence Alignment                         35.42       34.46       33.16       33.06       32.17
gmpbench: Total Time                                       552.84      554.83      554.37      552.56      555.10
fhourstones: Complex Connect-4 Solving                     3212.10     3210.20     3213.77     3209.67     3205.40
ttsiod-renderer: Phong Rendering With Soft-Shadow Mapping  23.16       23.01       23.77       23.47       23.49
c-ray: Total Time                                          187.97      149.82      184.81      186.69      186.61
primesieve: 1e12 Prime Number Generation                   543.16      566.21      540.95      553.13      573.13
smallpt: Global Illumination Renderer; 100 Samples         167         167         168         168         167
sudokut: Total Time                                        101.95      102.75      101.75      101.88      102.17
tachyon: Total Time                                        69.27       71.41       67.64       69.40       69.34
openssl: RSA 4096-bit Performance                          21.50       21.50       21.50       21.50       21.50
redis: GET                                                 310344.73   318926.02   311785.02   309030.64   313438.91
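Because the summary mixes "more is better" and "fewer is better" metrics, the raw numbers are hard to compare across tests. The following is a minimal Python sketch, not part of the original result file, that normalizes a few rows against the "A53 vectorize, pre-patch" run. The function names and the choice of baseline are assumptions made for illustration; the values are copied from the summary and the direction flags follow the labels on the per-test graphs.

```python
# Relative performance versus a baseline run (illustrative sketch).
# Values are copied from the results summary; True means "more is better".

BASELINE = "A53 vectorize, pre-patch"  # assumed baseline, chosen for illustration

RESULTS = {
    "ramspeed: Copy - Integer": (True, {
        "A53 vectorize, pre-patch": 4581.32,
        "thunderx/vectorize, pre-patch": 2821.43,
        "A53 vectorize/LTO, pre patch": 4829.91,
        "A53, post patch": 4965.06,
        "A53 mtune/vectorize, post-patch": 4955.97,
    }),
    "c-ray: Total Time": (False, {
        "A53 vectorize, pre-patch": 187.97,
        "thunderx/vectorize, pre-patch": 149.82,
        "A53 vectorize/LTO, pre patch": 184.81,
        "A53, post patch": 186.69,
        "A53 mtune/vectorize, post-patch": 186.61,
    }),
}


def relative(value, baseline, higher_is_better):
    """Return a ratio above 1.0 when `value` is better than `baseline`."""
    return value / baseline if higher_is_better else baseline / value


def relative_table(results, baseline_run=BASELINE):
    """Map each test to {run: ratio versus the baseline run}."""
    table = {}
    for test, (higher_is_better, runs) in results.items():
        base = runs[baseline_run]
        table[test] = {
            run: relative(value, base, higher_is_better)
            for run, value in runs.items()
        }
    return table


if __name__ == "__main__":
    for test, runs in relative_table(RESULTS).items():
        print(test)
        for run, ratio in runs.items():
            print(f"  {run:33s} {ratio * 100:6.1f}%")
```

With this convention a ratio above 100% always means the run did better than the baseline, whichever way the underlying metric points.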

PostMark

This is a test of NetApp's PostMark benchmark designed to simulate small-file testing similar to the tasks endured by web and mail servers. This test profile will set PostMark to perform 25,000 transactions with 500 files simultaneously with the file sizes ranging between 5 and 512 kilobytes.

PostMark 1.51, Disk Transaction Performance (TPS, more is better)

Run                                Avg        SE      N   Min     Max
A53 vectorize, pre-patch           1363.33    2.67    3   1358    1366
thunderx/vectorize, pre-patch      1351       0.00    3   1351    1351
A53 vectorize/LTO, pre patch       1378.33    2.67    3   1373    1381
A53, post patch                    1380.67    4.33    3   1373    1388
A53 mtune/vectorize, post-patch    1378       5.00    3   1373    1388

(CC) gcc options: -O3
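The SE figures can be cross-checked against the Min / Avg / Max values when N = 3, because the third sample is then determined by the other two and the average. The sketch below assumes that SE means the sample standard deviation divided by the square root of N; the page does not state how it is computed, and the function name is hypothetical.

```python
import math
import statistics


def standard_error_from_summary(minimum, average, maximum):
    """Rebuild the three samples of an N = 3 run and return their standard error.

    Assumes SE = sample standard deviation / sqrt(N), which is an assumption
    about the report and not something the page states.
    """
    # With three samples, the remaining one follows from the average.
    middle = 3 * average - minimum - maximum
    samples = [minimum, middle, maximum]
    return statistics.stdev(samples) / math.sqrt(len(samples))


if __name__ == "__main__":
    # PostMark, "A53 vectorize, pre-patch": Min 1358 / Avg 1363.33 / Max 1366
    print(round(standard_error_from_summary(1358, 1363.33, 1366), 2))
```

Because the listed averages are rounded to two decimals, the reconstructed value can differ from the listed SE in the last digit.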

RAMspeed SMP

RAMspeed SMP 3.5.0 (MB/s, more is better)

Run                                Copy - Integer    Copy - Floating Point
A53 vectorize, pre-patch           4581.32           4580.39
thunderx/vectorize, pre-patch      2821.43           2817.45
A53 vectorize/LTO, pre patch       4829.91           4825.13
A53, post patch                    4965.06           4964.66
A53 mtune/vectorize, post-patch    4955.97           4965.60

FFTW

FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions.

FFTW 3.3.4, Build: Stock - Size: 2D FFT Size 2048 (Mflops, more is better)

Run                                Avg       SE      N   Min       Max
A53 vectorize, pre-patch           196.90    0.99    5   194.64    200.5
thunderx/vectorize, pre-patch      190.63    1.10    5   188.41    193.37
A53 vectorize/LTO, pre patch       180.53    0.49    5   179.17    181.62
A53, post patch                    186.21    0.08    5   185.91    186.38
A53 mtune/vectorize, post-patch    184.81    0.21    5   184.26    185.55

Per-run compiler flags:
A53 vectorize, pre-patch:          -Ofast -mcpu=cortex-a53 -ftree-vectorize
thunderx/vectorize, pre-patch:     -Ofast -mcpu=thunderx -ftree-vectorize
A53 vectorize/LTO, pre patch:      -O3 -mcpu=cortex-a53 -ftree-vectorize -flto -ffat-lto-objects
A53, post patch:                   -Ofast -mcpu=cortex-a53
A53 mtune/vectorize, post-patch:   -Ofast -mtune=cortex-a53 -ftree-vectorize

(CC) gcc options: -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -lm

Timed MAFFT Alignment

Timed MAFFT Alignment 6.864, Multiple Sequence Alignment (Seconds, fewer is better)

Run                                Avg      SE      N   Min      Max
A53 vectorize, pre-patch           35.42    0.80    6   32.14    38.35
thunderx/vectorize, pre-patch      34.46    0.73    6   31.91    35.82
A53 vectorize/LTO, pre patch       33.16    0.70    6   31.84    35.37
A53, post patch                    33.06    0.71    6   31.61    35.43
A53 mtune/vectorize, post-patch    32.17    0.01    3   32.15    32.19

(CC) gcc options: -O3 -lm -lpthread

GMPbench

GMPbench 0.2, Total Time (GMPbench Score, more is better)

Run                                Score
A53 vectorize, pre-patch           552.84
thunderx/vectorize, pre-patch      554.83
A53 vectorize/LTO, pre patch       554.37
A53, post patch                    552.56
A53 mtune/vectorize, post-patch    555.10

Per-run compiler flags:
A53 vectorize, pre-patch:          -Ofast -mcpu=cortex-a53 -ftree-vectorize
thunderx/vectorize, pre-patch:     -Ofast -mcpu=thunderx -ftree-vectorize
A53 vectorize/LTO, pre patch:      -O3 -mcpu=cortex-a53 -ftree-vectorize -flto -ffat-lto-objects
A53, post patch:                   -Ofast -mcpu=cortex-a53
A53 mtune/vectorize, post-patch:   -Ofast -mtune=cortex-a53 -ftree-vectorize

(CC) gcc options: -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -lm

Fhourstones

Fhourstones 3.1, Complex Connect-4 Solving (Kpos / sec, more is better)

Run                                Avg        SE      N   Min       Max
A53 vectorize, pre-patch           3212.10    0.35    3   3211.5    3212.7
thunderx/vectorize, pre-patch      3210.20    0.76    3   3208.7    3211.2
A53 vectorize/LTO, pre patch       3213.77    0.22    3   3213.5    3214.2
A53, post patch                    3209.67    1.47    3   3207.2    3212.3
A53 mtune/vectorize, post-patch    3205.40    1.81    3   3203.2    3209

(CC) gcc options: -O3

TTSIOD 3D Renderer

A portable GPL 3D software renderer that supports OpenMP and Intel Threading Building Blocks with many different rendering modes. This version does not use OpenGL but is entirely CPU/software based.

TTSIOD 3D Renderer 2.3a, Phong Rendering With Soft-Shadow Mapping (FPS, more is better)

Run                                Avg      SE      N   Min      Max
A53 vectorize, pre-patch           23.16    0.01    3   23.14    23.16
thunderx/vectorize, pre-patch      23.01    0.00    3   23.01    23.01
A53 vectorize/LTO, pre patch       23.77    0.01    3   23.74    23.78
A53, post patch                    23.47    0.02    3   23.44    23.49
A53 mtune/vectorize, post-patch    23.49    0.01    3   23.47    23.51

Per-run compiler flags:
A53 vectorize, pre-patch:          -Ofast -mcpu=cortex-a53 -ftree-vectorize
thunderx/vectorize, pre-patch:     -Ofast -mcpu=thunderx -ftree-vectorize
A53 vectorize/LTO, pre patch:      -O3 -mcpu=cortex-a53 -ftree-vectorize -ffat-lto-objects
A53, post patch:                   -Ofast -mcpu=cortex-a53
A53 mtune/vectorize, post-patch:   -Ofast -mtune=cortex-a53 -ftree-vectorize

(CXX) g++ options: -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -ffast-math -mtune=native -flto -lSDL -lstdc++

C-Ray

This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image.

C-Ray 1.1, Total Time (Seconds, fewer is better)

Run                                Avg       SE      N   Min       Max
A53 vectorize, pre-patch           187.97    0.69    3   186.75    189.13
thunderx/vectorize, pre-patch      149.82    1.37    3   148.36    152.55
A53 vectorize/LTO, pre patch       184.81    0.17    3   184.51    185.1
A53, post patch                    186.69    0.14    3   186.54    186.96
A53 mtune/vectorize, post-patch    186.61    0.12    3   186.37    186.72

Per-run compiler flags:
A53 vectorize, pre-patch:          -Ofast -mcpu=cortex-a53 -ftree-vectorize
thunderx/vectorize, pre-patch:     -Ofast -mcpu=thunderx -ftree-vectorize
A53 vectorize/LTO, pre patch:      -mcpu=cortex-a53 -ftree-vectorize -flto -ffat-lto-objects
A53, post patch:                   -Ofast -mcpu=cortex-a53
A53 mtune/vectorize, post-patch:   -Ofast -mtune=cortex-a53 -ftree-vectorize

(CC) gcc options: -lm -lpthread -O3 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc

Primesieve

Primesieve 5.4.2, 1e12 Prime Number Generation (Seconds, fewer is better)

Run                                Avg       SE      N   Min       Max
A53 vectorize, pre-patch           543.16    3.01    3   537.24    547.08
thunderx/vectorize, pre-patch      566.21    2.99    3   561.7     571.87
A53 vectorize/LTO, pre patch       540.95    8.42    3   532.14    557.79
A53, post patch                    553.13    9.14    3   536.7     568.29
A53 mtune/vectorize, post-patch    573.13    6.92    3   560.03    583.51

Per-run compiler flags:
A53 vectorize, pre-patch:          -Ofast -mcpu=cortex-a53 -ftree-vectorize
thunderx/vectorize, pre-patch:     -Ofast -mcpu=thunderx -ftree-vectorize
A53 vectorize/LTO, pre patch:      -O3 -mcpu=cortex-a53 -ftree-vectorize -flto -ffat-lto-objects
A53, post patch:                   -Ofast -mcpu=cortex-a53
A53 mtune/vectorize, post-patch:   -Ofast -mtune=cortex-a53 -ftree-vectorize

(CXX) g++ options: -fomit-frame-pointer -fipa-pta -march=armv8-a+crc -fopenmp
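Primesieve has some of the larger standard errors in this comparison, so it helps to put the gap between two runs next to their combined uncertainty before reading it as a codegen effect. The sketch below is an illustration only: it assumes the two runs are independent and expresses the difference of means in units of the combined standard error. The function name is hypothetical, and with N = 3 the result is a rough guide, not a formal significance test.

```python
import math


def difference_in_standard_errors(mean_a, se_a, mean_b, se_b):
    """Return |mean_a - mean_b| divided by the combined standard error.

    The combined standard error is sqrt(se_a^2 + se_b^2), which assumes
    the two runs are independent.
    """
    combined = math.sqrt(se_a ** 2 + se_b ** 2)
    if combined == 0:
        # Both runs report zero spread: any difference is "infinitely" clear.
        return 0.0 if mean_a == mean_b else math.inf
    return abs(mean_a - mean_b) / combined


if __name__ == "__main__":
    # Primesieve: "A53, post patch" vs "A53 mtune/vectorize, post-patch"
    ratio = difference_in_standard_errors(553.13, 9.14, 573.13, 6.92)
    print(f"difference = {ratio:.2f} combined standard errors")
```

A ratio well below about 2 suggests the two averages are hard to tell apart at this sample size; a much larger ratio suggests the gap is unlikely to be run-to-run noise alone.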

Smallpt

Smallpt 1.0, Global Illumination Renderer; 100 Samples (Seconds, fewer is better)

Run                                Avg    SE      N   Min    Max
A53 vectorize, pre-patch           167    0.00    3   167    167
thunderx/vectorize, pre-patch      167    0.00    3   167    167
A53 vectorize/LTO, pre patch       168    0.00    3   168    168
A53, post patch                    168    0.00    3   168    168
A53 mtune/vectorize, post-patch    167    0.00    3   167    167

Per-run compiler flags:
A53 vectorize, pre-patch:          -Ofast -mcpu=cortex-a53 -ftree-vectorize
thunderx/vectorize, pre-patch:     -Ofast -mcpu=thunderx -ftree-vectorize
A53 vectorize/LTO, pre patch:      -O3 -mcpu=cortex-a53 -ftree-vectorize -flto -ffat-lto-objects
A53, post patch:                   -Ofast -mcpu=cortex-a53
A53 mtune/vectorize, post-patch:   -Ofast -mtune=cortex-a53 -ftree-vectorize

(CXX) g++ options: -fopenmp -fomit-frame-pointer -fipa-pta -march=armv8-a+crc

Sudokut

This is a test of Sudokut, which is a Sudoku puzzle solver written in Tcl. This test measures how long it takes to solve 100 Sudoku puzzles.

Sudokut 0.4, Total Time (Seconds, fewer is better)

Run                                Avg       SE      N   Min       Max
A53 vectorize, pre-patch           101.95    0.20    3   101.73    102.35
thunderx/vectorize, pre-patch      102.75    0.76    3   101.98    104.26
A53 vectorize/LTO, pre patch       101.75    0.21    3   101.51    102.17
A53, post patch                    101.88    0.10    3   101.7     102.05
A53 mtune/vectorize, post-patch    102.17    0.09    3   102.06    102.35

Tachyon

This is a test of the threaded Tachyon, a parallel ray-tracing system.

Tachyon 0.98.9, Total Time (Seconds, fewer is better)

Run                                Avg      SE      N   Min      Max
A53 vectorize, pre-patch           69.27    0.08    3   69.18    69.42
thunderx/vectorize, pre-patch      71.41    0.06    3   71.32    71.52
A53 vectorize/LTO, pre patch       67.64    0.11    3   67.52    67.86
A53, post patch                    69.40    0.12    3   69.2     69.61
A53 mtune/vectorize, post-patch    69.34    0.10    3   69.19    69.54

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test measures the RSA 4096-bit performance of OpenSSL.

OpenSSL 1.0.1g, RSA 4096-bit Performance (Signs Per Second, more is better)

Run                                Avg      SE      N   Min     Max
A53 vectorize, pre-patch           21.50    0.00    3   21.5    21.5
thunderx/vectorize, pre-patch      21.50    0.00    3   21.5    21.5
A53 vectorize/LTO, pre patch       21.50    0.00    3   21.5    21.5
A53, post patch                    21.50    0.00    3   21.5    21.5
A53 mtune/vectorize, post-patch    21.50    0.00    3   21.5    21.5

(CC) gcc options: -O3 -fomit-frame-pointer -lssl -lcrypto -ldl

Redis

Redis is an open-source data structure server.

Redis 3.0.1, Test: GET (Requests Per Second, more is better)

Run                                Avg          SE         N   Min          Max
A53 vectorize, pre-patch           310344.73    4662.92    6   289435.59    322268.75
thunderx/vectorize, pre-patch      318926.02    2784.59    3   314465.41    324044.06
A53 vectorize/LTO, pre patch       311785.02    2239.53    3   309310.22    316255.53
A53, post patch                    309030.64    1052.91    3   307314.06    310945.28
A53 mtune/vectorize, post-patch    313438.91    1967.34    3   309693.41    316355.56

Per-run compiler flags:
A53 vectorize, pre-patch:          -Ofast -mcpu=cortex-a53 -ftree-vectorize
thunderx/vectorize, pre-patch:     -Ofast -mcpu=thunderx -ftree-vectorize
A53 vectorize/LTO, pre patch:      -O3 -mcpu=cortex-a53 -ftree-vectorize -flto -ffat-lto-objects
A53, post patch:                   -Ofast -mcpu=cortex-a53
A53 mtune/vectorize, post-patch:   -Ofast -mtune=cortex-a53 -ftree-vectorize

(CC) gcc options: -ggdb -rdynamic -lm -pthread -ldl -O2 -fomit-frame-pointer -fipa-pta -march=armv8-a+crc