GCC9 POWER9 Compiler Benchmarks

POWER9 compiler benchmarking on a PowerNV C1P9S01 REV 1.01 system for a future article.

HTML result view exported from: https://openbenchmarking.org/result/1906236-HV-GCC9POWER43&grr.

System details (identical for all three configurations except the compiler):
Processor: POWER9 @ 3.80GHz (4 Cores / 16 Threads)
Motherboard: PowerNV C1P9S01 REV 1.01
Memory: 131072MB
Disk: 1024GB SAMSUNG MZVLB1T0HALR-000L7
Graphics: ASPEED
Network: 3 x Broadcom NetXtreme BCM5719 PCIe
OS: Ubuntu 19.04
Kernel: 5.0.0-17-generic (ppc64le)
File-System: ext4
Screen Resolution: 1024x768
Compiler:
- GCC 9.1.0: GCC 9.1.0
- GCC 10.0.0 20190616: GCC 10.0.0 20190616
- LLVM Clang 8.0.1: Clang 8.0.1 + LLVM 8.0.1

Compiler Details
- GCC 9.1.0: --enable-checking=release
- GCC 10.0.0 20190616: --enable-checking=release
- LLVM Clang 8.0.1: Optimized build; Default target: powerpc64le-unknown-linux-gnu; Host CPU: pwr9

Processor Details
- SMT (threads per core): 4

Python Details
- Python 2.7.16 + Python 3.7.3

Security Details
- l1tf: Not affected
- mds: Not affected
- meltdown: Mitigation of RFI Flush L1D private per thread
- spec_store_bypass: Mitigation of Kernel entry/exit barrier (eieio)
- spectre_v1: Mitigation of __user pointer sanitization ori31 speculation barrier enabled
- spectre_v2: Mitigation of Indirect branch cache disabled

Result overview (a dash marks a test with no result for that compiler; units are given in the individual results below):

Test | GCC 9.1.0 | GCC 10.0.0 20190616 | LLVM Clang 8.0.1
cpp-perf-bench: Rand Numbers | 2156.82 | 2137.23 | 4529.86
cpp-perf-bench: Math Library | 781.59 | 780.84 | 775.78
build-llvm: Time To Compile | 1260.99 | 1280.22 | 1060.28
x265: H.265 1080p Video Encoding | 3.92 | 3.31 | -
dav1d: Summer Nature 4K | 382.25 | 378.12 | 392.45
c-ray: Total Time - 4K, 16 Rays Per Pixel | 204.93 | 205.35 | 667.90
himeno: Poisson Pressure Solver | 637.16 | 636.79 | 581.59
encode-flac: WAV To FLAC | 40.33 | 40.32 | 48.75
cpp-perf-bench: Ctype | 167.77 | 167.57 | 153.26
pgbench: Buffer Test - Normal Load - Read Only | 96860.69 | 96945.17 | 94535.80
pgbench: Buffer Test - Normal Load - Read Write | 10960.37 | 10999.59 | 10917.57
cpp-perf-bench: Stepanov Vector | 139.65 | 139.74 | 131.37
graphics-magick: Sharpen | 74 | 69 | 18
cpp-perf-bench: Atol | 122.80 | 121.04 | 120.21
x264: H.264 Video Encoding | 15.71 | 13.63 | 15.33
dav1d: Summer Nature 1080p | 93.60 | 92.02 | 90.14
graphics-magick: Noise-Gaussian | 77 | 75 | 16
graphics-magick: Enhanced | 85 | 83 | 22
graphics-magick: Swirl | 131 | 128 | 44
graphics-magick: Rotate | 185 | 184 | 158
graphics-magick: Resizing | 168 | 165 | 85
graphics-magick: HWB Color Space | 179 | 179 | 116
apache: Static Web Page Serving | 17985.14 | 17853.33 | 17819.95
encode-mp3: WAV To MP3 | 75.30 | 75.84 | 19.29
cpp-perf-bench: Stepanov Abstraction | 57.19 | 57.18 | 53.07
redis: LPOP | 1059020.41 | 1045275.77 | 1007820.41
redis: SADD | 710532.17 | 725169.62 | 728590.02
redis: GET | 934311.06 | 942834.62 | 993911.90
smallpt: Global Illumination Renderer; 128 Samples | 38.01 | 37.68 | -
redis: LPUSH | 568842.41 | 574786.06 | 571604.42
redis: SET | 585363.53 | 571452.36 | 571064.36
cpp-perf-bench: Function Objects | 26.41 | 26.39 | 27.88
scimark2: Composite | 268.65 | 263.02 | 277.43
openssl: RSA 4096-bit Performance | 852.30 | 838.97 | 791.23
tjbench: Decompression Throughput | 101.36 | 101.14 | 105.05
scimark2: Jacobi Successive Over-Relaxation | 562.62 | 560.31 | 588.43
scimark2: Dense LU Matrix Factorization | 283.27 | 255.18 | 285.92
scimark2: Sparse Matrix Multiply | 273.69 | 274.26 | 269.37
scimark2: Fast Fourier Transform | 163.73 | 164.05 | 174.48
scimark2: Monte Carlo | 59.95 | 61.29 | 68.98

CppPerformanceBenchmarks

Test: Random Numbers

CppPerformanceBenchmarks 9 (Seconds, Fewer Is Better)
GCC 9.1.0: 2156.82 (SE +/- 3.96, N = 3)
GCC 10.0.0 20190616: 2137.23 (SE +/- 0.13, N = 3)
LLVM Clang 8.0.1: 4529.86 (SE +/- 0.45, N = 3)
(CXX) g++ options: -std=c++11 -O3

CppPerformanceBenchmarks

Test: Math Library

CppPerformanceBenchmarks 9 (Seconds, Fewer Is Better)
GCC 9.1.0: 781.59 (SE +/- 1.01, N = 3)
GCC 10.0.0 20190616: 780.84 (SE +/- 0.13, N = 3)
LLVM Clang 8.0.1: 775.78 (SE +/- 0.07, N = 3)
(CXX) g++ options: -std=c++11 -O3

Timed LLVM Compilation

Time To Compile

Timed LLVM Compilation 6.0.1 (Seconds, Fewer Is Better)
GCC 9.1.0: 1260.99
GCC 10.0.0 20190616: 1280.22
LLVM Clang 8.0.1: 1060.28

x265

H.265 1080p Video Encoding

x265 3.0 (Frames Per Second, More Is Better)
GCC 9.1.0: 3.92 (SE +/- 0.07, N = 12)
GCC 10.0.0 20190616: 3.31 (SE +/- 0.02, N = 3)
(CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

dav1d

Video Input: Summer Nature 4K

dav1d 0.3 (Seconds, Fewer Is Better)
GCC 9.1.0: 382.25 (SE +/- 0.80, N = 3)
GCC 10.0.0 20190616: 378.12 (SE +/- 0.74, N = 3)
LLVM Clang 8.0.1: 392.45 (SE +/- 3.25, N = 3)
(CC) gcc options: -pthread

C-Ray

Total Time - 4K, 16 Rays Per Pixel

C-Ray 1.1 (Seconds, Fewer Is Better)
GCC 9.1.0: 204.93 (SE +/- 1.98, N = 3)
GCC 10.0.0 20190616: 205.35 (SE +/- 0.35, N = 3)
LLVM Clang 8.0.1: 667.90 (SE +/- 0.57, N = 3)
(CC) gcc options: -lm -lpthread -O3

Himeno Benchmark

Poisson Pressure Solver

Himeno Benchmark 3.0 (MFLOPS, More Is Better)
GCC 9.1.0: 637.16 (SE +/- 6.46, N = 15)
GCC 10.0.0 20190616: 636.79 (SE +/- 3.16, N = 3)
LLVM Clang 8.0.1: 581.59 (SE +/- 6.53, N = 15)
(CC) gcc options: -O3

FLAC Audio Encoding

WAV To FLAC

FLAC Audio Encoding 1.3.2 (Seconds, Fewer Is Better)
GCC 9.1.0: 40.33 (SE +/- 0.30, N = 16)
GCC 10.0.0 20190616: 40.32 (SE +/- 0.35, N = 12)
LLVM Clang 8.0.1: 48.75 (SE +/- 0.44, N = 10)
(CXX) g++ options: -O2 -lm
Additional option shown for two builds (which ones is not preserved in this export): -fvisibility=hidden

CppPerformanceBenchmarks

Test: Ctype

CppPerformanceBenchmarks 9 (Seconds, Fewer Is Better)
GCC 9.1.0: 167.77 (SE +/- 0.27, N = 3)
GCC 10.0.0 20190616: 167.57 (SE +/- 0.03, N = 3)
LLVM Clang 8.0.1: 153.26 (SE +/- 0.32, N = 3)
(CXX) g++ options: -std=c++11 -O3

PostgreSQL pgbench

Scaling: Buffer Test - Test: Normal Load - Mode: Read Only

PostgreSQL pgbench 10.3 (TPS, More Is Better)
GCC 9.1.0: 96860.69 (SE +/- 1248.50, N = 3)
GCC 10.0.0 20190616: 96945.17 (SE +/- 876.13, N = 3)
LLVM Clang 8.0.1: 94535.80 (SE +/- 932.17, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm

PostgreSQL pgbench

Scaling: Buffer Test - Test: Normal Load - Mode: Read Write

PostgreSQL pgbench 10.3 (TPS, More Is Better)
GCC 9.1.0: 10960.37 (SE +/- 35.67, N = 3)
GCC 10.0.0 20190616: 10999.59 (SE +/- 15.98, N = 3)
LLVM Clang 8.0.1: 10917.57 (SE +/- 9.76, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm

CppPerformanceBenchmarks

Test: Stepanov Vector

CppPerformanceBenchmarks 9 (Seconds, Fewer Is Better)
GCC 9.1.0: 139.65 (SE +/- 0.00, N = 3)
GCC 10.0.0 20190616: 139.74 (SE +/- 0.03, N = 3)
LLVM Clang 8.0.1: 131.37 (SE +/- 0.00, N = 3)
(CXX) g++ options: -std=c++11 -O3

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.30 (Iterations Per Minute, More Is Better)
GCC 9.1.0: 74
GCC 10.0.0 20190616: 69
LLVM Clang 8.0.1: 18
Standard errors reported (build attribution not preserved in this export): SE +/- 0.67, N = 3; SE +/- 0.55, N = 13
(CC) gcc options: -O2 -pthread -lSM -lICE -lX11 -lz -lm -lpthread
Additional options shown for two builds (which ones is not preserved in this export): -fopenmp -ldl

CppPerformanceBenchmarks

Test: Atol

CppPerformanceBenchmarks 9 (Seconds, Fewer Is Better)
GCC 9.1.0: 122.80 (SE +/- 0.14, N = 3)
GCC 10.0.0 20190616: 121.04 (SE +/- 0.22, N = 3)
LLVM Clang 8.0.1: 120.21 (SE +/- 0.10, N = 3)
(CXX) g++ options: -std=c++11 -O3

x264

H.264 Video Encoding

x264 2018-09-25 (Frames Per Second, More Is Better)
GCC 9.1.0: 15.71 (SE +/- 0.03, N = 3)
GCC 10.0.0 20190616: 13.63 (SE +/- 0.20, N = 15)
LLVM Clang 8.0.1: 15.33 (SE +/- 0.25, N = 3)
(CC) gcc options: -ldl -lm -lpthread -O3 -ffast-math -maltivec -mabi=altivec -mvsx -std=gnu99 -fPIC -fomit-frame-pointer -fno-tree-vectorize

dav1d

Video Input: Summer Nature 1080p

dav1d 0.3 (Seconds, Fewer Is Better)
GCC 9.1.0: 93.60 (SE +/- 0.62, N = 3)
GCC 10.0.0 20190616: 92.02 (SE +/- 0.62, N = 3)
LLVM Clang 8.0.1: 90.14 (SE +/- 0.98, N = 3)
(CC) gcc options: -pthread

GraphicsMagick

Operation: Noise-Gaussian

GraphicsMagick 1.3.30 (Iterations Per Minute, More Is Better)
GCC 9.1.0: 77
GCC 10.0.0 20190616: 75
LLVM Clang 8.0.1: 16
Standard errors reported (build attribution not preserved in this export): SE +/- 1.20, N = 3; SE +/- 1.15, N = 3
(CC) gcc options: -O2 -pthread -lSM -lICE -lX11 -lz -lm -lpthread
Additional options shown for two builds (which ones is not preserved in this export): -fopenmp -ldl

GraphicsMagick

Operation: Enhanced

GraphicsMagick 1.3.30 (Iterations Per Minute, More Is Better)
GCC 9.1.0: 85
GCC 10.0.0 20190616: 83
LLVM Clang 8.0.1: 22
(CC) gcc options: -O2 -pthread -lSM -lICE -lX11 -lz -lm -lpthread
Additional options shown for two builds (which ones is not preserved in this export): -fopenmp -ldl

GraphicsMagick

Operation: Swirl

GraphicsMagick 1.3.30 (Iterations Per Minute, More Is Better)
GCC 9.1.0: 131 (SE +/- 1.45, N = 3)
GCC 10.0.0 20190616: 128 (SE +/- 1.45, N = 3)
LLVM Clang 8.0.1: 44 (SE +/- 0.33, N = 3)
(CC) gcc options: -O2 -pthread -lSM -lICE -lX11 -lz -lm -lpthread
Additional options shown for two builds (which ones is not preserved in this export): -fopenmp -ldl

GraphicsMagick

Operation: Rotate

GraphicsMagick 1.3.30 (Iterations Per Minute, More Is Better)
GCC 9.1.0: 185
GCC 10.0.0 20190616: 184
LLVM Clang 8.0.1: 158
Standard error reported (build attribution not preserved in this export): SE +/- 0.67, N = 3
(CC) gcc options: -O2 -pthread -lSM -lICE -lX11 -lz -lm -lpthread
Additional options shown for two builds (which ones is not preserved in this export): -fopenmp -ldl

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.30 (Iterations Per Minute, More Is Better)
GCC 9.1.0: 168
GCC 10.0.0 20190616: 165
LLVM Clang 8.0.1: 85
Standard errors reported (build attribution not preserved in this export): SE +/- 0.33, N = 3; SE +/- 1.76, N = 3
(CC) gcc options: -O2 -pthread -lSM -lICE -lX11 -lz -lm -lpthread
Additional options shown for two builds (which ones is not preserved in this export): -fopenmp -ldl

GraphicsMagick

Operation: HWB Color Space

GraphicsMagick 1.3.30 (Iterations Per Minute, More Is Better)
GCC 9.1.0: 179
GCC 10.0.0 20190616: 179
LLVM Clang 8.0.1: 116
Standard errors reported (build attribution not preserved in this export): SE +/- 0.33, N = 3; SE +/- 0.33, N = 3
(CC) gcc options: -O2 -pthread -lSM -lICE -lX11 -lz -lm -lpthread
Additional options shown for two builds (which ones is not preserved in this export): -fopenmp -ldl

Apache Benchmark

Static Web Page Serving

Apache Benchmark 2.4.29 (Requests Per Second, More Is Better)
GCC 9.1.0: 17985.14 (SE +/- 65.06, N = 3)
GCC 10.0.0 20190616: 17853.33 (SE +/- 55.87, N = 3)
LLVM Clang 8.0.1: 17819.95 (SE +/- 40.71, N = 3)
(CC) gcc options: -shared -fPIC -O2 -pthread

LAME MP3 Encoding

WAV To MP3

LAME MP3 Encoding 3.100 (Seconds, Fewer Is Better)
GCC 9.1.0: 75.30 (SE +/- 0.12, N = 3)
GCC 10.0.0 20190616: 75.84 (SE +/- 0.04, N = 3)
LLVM Clang 8.0.1: 19.29 (SE +/- 0.07, N = 3)
(CC) gcc options: -lm
Additional options shown for one build (which one is not preserved in this export): -O3 -pipe

CppPerformanceBenchmarks

Test: Stepanov Abstraction

CppPerformanceBenchmarks 9 (Seconds, Fewer Is Better)
GCC 9.1.0: 57.19 (SE +/- 0.00, N = 3)
GCC 10.0.0 20190616: 57.18 (SE +/- 0.00, N = 3)
LLVM Clang 8.0.1: 53.07 (SE +/- 0.00, N = 3)
(CXX) g++ options: -std=c++11 -O3

Redis

Test: LPOP

Redis 4.0.8 (Requests Per Second, More Is Better)
GCC 9.1.0: 1059020.41 (SE +/- 16618.25, N = 15)
GCC 10.0.0 20190616: 1045275.77 (SE +/- 16475.39, N = 15)
LLVM Clang 8.0.1: 1007820.41 (SE +/- 16615.22, N = 15)
(CC) gcc options: -ggdb -rdynamic -lm -ldl -pthread

Redis

Test: SADD

Redis 4.0.8 (Requests Per Second, More Is Better)
GCC 9.1.0: 710532.17 (SE +/- 7312.11, N = 15)
GCC 10.0.0 20190616: 725169.62 (SE +/- 8548.45, N = 15)
LLVM Clang 8.0.1: 728590.02 (SE +/- 9654.52, N = 3)
(CC) gcc options: -ggdb -rdynamic -lm -ldl -pthread

Redis

Test: GET

Redis 4.0.8 (Requests Per Second, More Is Better)
GCC 9.1.0: 934311.06 (SE +/- 14189.58, N = 15)
GCC 10.0.0 20190616: 942834.62 (SE +/- 16704.21, N = 15)
LLVM Clang 8.0.1: 993911.90 (SE +/- 16428.18, N = 3)
(CC) gcc options: -ggdb -rdynamic -lm -ldl -pthread

Smallpt

Global Illumination Renderer; 128 Samples

Smallpt 1.0 (Seconds, Fewer Is Better)
GCC 9.1.0: 38.01 (SE +/- 0.09, N = 3)
GCC 10.0.0 20190616: 37.68 (SE +/- 0.12, N = 3)
(CXX) g++ options: -fopenmp -O3

Redis

Test: LPUSH

Redis 4.0.8 (Requests Per Second, More Is Better)
GCC 9.1.0: 568842.41 (SE +/- 6664.54, N = 6)
GCC 10.0.0 20190616: 574786.06 (SE +/- 7303.99, N = 3)
LLVM Clang 8.0.1: 571604.42 (SE +/- 4782.46, N = 15)
(CC) gcc options: -ggdb -rdynamic -lm -ldl -pthread

Redis

Test: SET

Redis 4.0.8 (Requests Per Second, More Is Better)
GCC 9.1.0: 585363.53 (SE +/- 7818.46, N = 4)
GCC 10.0.0 20190616: 571452.36 (SE +/- 6076.05, N = 15)
LLVM Clang 8.0.1: 571064.36 (SE +/- 7412.36, N = 4)
(CC) gcc options: -ggdb -rdynamic -lm -ldl -pthread

CppPerformanceBenchmarks

Test: Function Objects

CppPerformanceBenchmarks 9 (Seconds, Fewer Is Better)
GCC 9.1.0: 26.41 (SE +/- 0.00, N = 3)
GCC 10.0.0 20190616: 26.39 (SE +/- 0.00, N = 3)
LLVM Clang 8.0.1: 27.88 (SE +/- 0.00, N = 3)
(CXX) g++ options: -std=c++11 -O3

SciMark

Computational Test: Composite

SciMark 2.0 (Mflops, More Is Better)
GCC 9.1.0: 268.65 (SE +/- 2.15, N = 3)
GCC 10.0.0 20190616: 263.02 (SE +/- 2.17, N = 3)
LLVM Clang 8.0.1: 277.43 (SE +/- 0.93, N = 3)
(CC) gcc options: -lm

OpenSSL

RSA 4096-bit Performance

OpenSSL 1.1.1 (Signs Per Second, More Is Better)
GCC 9.1.0: 852.30 (SE +/- 0.15, N = 3)
GCC 10.0.0 20190616: 838.97 (SE +/- 0.09, N = 3)
LLVM Clang 8.0.1: 791.23 (SE +/- 0.18, N = 3)
(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
Additional option shown for one build (which one is not preserved in this export): -Qunused-arguments

libjpeg-turbo tjbench

Test: Decompression Throughput

libjpeg-turbo tjbench 1.5.3 (Megapixels/sec, More Is Better)
GCC 9.1.0: 101.36 (SE +/- 0.03, N = 3)
GCC 10.0.0 20190616: 101.14 (SE +/- 0.01, N = 3)
LLVM Clang 8.0.1: 105.05 (SE +/- 0.01, N = 3)
(CC) gcc options: -O3
Additional option shown for two builds (which ones is not preserved in this export): -lm

SciMark

Computational Test: Jacobi Successive Over-Relaxation

SciMark 2.0 (Mflops, More Is Better)
GCC 9.1.0: 562.62 (SE +/- 4.92, N = 3)
GCC 10.0.0 20190616: 560.31 (SE +/- 3.41, N = 3)
LLVM Clang 8.0.1: 588.43 (SE +/- 0.17, N = 3)
(CC) gcc options: -lm

SciMark

Computational Test: Dense LU Matrix Factorization

SciMark 2.0 (Mflops, More Is Better)
GCC 9.1.0: 283.27 (SE +/- 12.96, N = 3)
GCC 10.0.0 20190616: 255.18 (SE +/- 8.31, N = 3)
LLVM Clang 8.0.1: 285.92 (SE +/- 2.83, N = 3)
(CC) gcc options: -lm

SciMark

Computational Test: Sparse Matrix Multiply

SciMark 2.0 (Mflops, More Is Better)
GCC 9.1.0: 273.69 (SE +/- 0.27, N = 3)
GCC 10.0.0 20190616: 274.26 (SE +/- 0.09, N = 3)
LLVM Clang 8.0.1: 269.37 (SE +/- 0.52, N = 3)
(CC) gcc options: -lm

SciMark

Computational Test: Fast Fourier Transform

SciMark 2.0 (Mflops, More Is Better)
GCC 9.1.0: 163.73 (SE +/- 0.58, N = 3)
GCC 10.0.0 20190616: 164.05 (SE +/- 0.16, N = 3)
LLVM Clang 8.0.1: 174.48 (SE +/- 0.58, N = 3)
(CC) gcc options: -lm

SciMark

Computational Test: Monte Carlo

SciMark 2.0 (Mflops, More Is Better)
GCC 9.1.0: 59.95 (SE +/- 0.03, N = 3)
GCC 10.0.0 20190616: 61.29 (SE +/- 0.01, N = 3)
LLVM Clang 8.0.1: 68.98 (SE +/- 1.38, N = 3)
(CC) gcc options: -lm


Phoronix Test Suite v10.8.4