AMD EPYC Rome GCC / LLVM / AOCC Compiler Benchmarks

AMD AOCC 2.0, GCC, LLVM Clang compiler benchmarks on EPYC 7742. Tests by Michael Larabel for a future article.

HTML result view exported from: https://openbenchmarking.org/result/1908094-AS-AOCC20EPY01&grr&sro.

System Details

Processor: 2 x AMD EPYC 7742 64-Core @ 2.25GHz (128 Cores / 256 Threads)
Motherboard: AMD DAYTONA_X (RDY1001C BIOS)
Chipset: AMD Device 1480
Memory: 516096MB
Disk: 280GB INTEL SSDPED1D280GA + 6 x 3841GB Micron_9300_MTFDHAL3T8TDP + 256GB Micron_1100_MTFD
Graphics: ASPEED
Monitor: VE228
Network: 2 x Mellanox MT27710
OS: Ubuntu 19.04
Kernel: 5.2.0-050200rc7-generic (x86_64) 20190630
Desktop: GNOME Shell 3.32.1
Display Server: X Server 1.20.4
Display Driver: modesetting 1.20.4
File-System: ext4
Screen Resolution: 1920x1080

Compiler (per tested configuration)

GCC 9.1.0: GCC 9.1.0
GCC 10.0 Git: GCC 10.0.0 20190804
LLVM Clang 9.0 SVN: Clang 9.0.0-svn364739-1~exp1+0~20190701101552.184~1.gbp124358
AOCC 2.0: Clang 8.0.0

Environment Details
- CXXFLAGS="-O3 -march=znver2" CFLAGS="-O3 -march=znver2"

Compiler Details
- GCC 9.1.0: --disable-multilib --enable-checking=release
- GCC 10.0 Git: --disable-multilib --enable-checking=release
- AOCC 2.0: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver1

Processor Details
- Scaling Governor: acpi-cpufreq ondemand

Python Details
- Python 2.7.16 + Python 3.7.3

Security Details
- l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling

Results Overview

Test                                                 GCC 9.1.0   GCC 10.0 Git  LLVM Clang 9.0 SVN  AOCC 2.0
cpp-perf-bench: Rand Numbers                         1599.07     1627.99       1892.46             1894.02
hpcg                                                 0.36        0.36          0.35                0.33
fftw: Stock - 2D FFT Size 4096                       6327.20     6289.30       5842.73             5593.44
cpp-perf-bench: Math Library                         343.45      333.85        331.02              330.41
apache: Static Web Page Serving                      24915.19    24096.60      24594.66            24163.31
graphics-magick: Resizing                            102         102           119                 120
cpp-perf-bench: Stepanov Vector                      99.21       97.70         85.89               89.73
cpp-perf-bench: Atol                                 74.36       74.32         74.10               74.38
graphics-magick: Rotate                              204         202           237                 205
fftw: Stock - 2D FFT Size 2048                       7381.00     7288.80       6627.77             6462.33
cpp-perf-bench: Ctype                                41.66       39.33         36.74               37.81
aobench: 2048 x 2048 - Total Time                    36.77       35.33         41.75               37.02
cpp-perf-bench: Stepanov Abstraction                 36.66       36.52         33.44               34.10
john-the-ripper: Blowfish                            148302      148542        187652              187140
coremark: CoreMark Size 666 - Iterations Per Second  3868113.71  3825301.45    3024508.81          3284059.92
scimark2: Composite                                  2834.06     2828.01       2880.56             2730.13
cpp-perf-bench: Function Objects                     18.93       19.01         18.82               18.94
x265: H.265 1080p Video Encoding                     44.54       44.73         45.37               45.57
svt-vp9: 1080p 8-bit YUV To VP9 Video Encode         277.45      283.63        274.55              -
svt-av1: 1080p 8-bit YUV To AV1 Video Encode         101.07      100.44        102.78              -
dav1d: Summer Nature 4K                              11.52       11.16         11.56               11.14
c-ray: Total Time - 4K, 16 Rays Per Pixel            5.72        5.82          9.19                9.01
tjbench: Decompression Throughput                    175.16      175.48        174.13              175.96
svt-hevc: 1080p 8-bit YUV To HEVC Video Encode       343.29      344.04        337.75              -
dav1d: Summer Nature 1080p                           4.75        4.77          4.86                4.77
x264: H.264 Video Encoding                           153.31      152.43        153.95              157.18
tscp: AI Chess Performance                           1072812     1040775       1149370             1093682
scimark2: Jacobi Successive Over-Relaxation          1801.31     1800.11       1655.77             1656.18
scimark2: Dense LU Matrix Factorization              8811.17     8762.75       8518.52             8473.11
scimark2: Sparse Matrix Multiply                     2741.97     2763.50       3382.93             2736.94
scimark2: Fast Fourier Transform                     203.68      202.71        224.52              178.57
scimark2: Monte Carlo                                612.17      610.98        621.05              605.88

A "-" marks a test with no AOCC 2.0 result in this file.
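
The results in this file mix metrics where fewer is better (seconds) and where more is better (Mflops, frames per second), so raw values cannot be compared across tests directly. One common way to compare two compilers on a single test is a ratio normalized so that values above 1 always mean "better than the baseline". The sketch below is only an illustration; the helper name ratio_vs_baseline is hypothetical and not part of the Phoronix Test Suite. The input values are the C-Ray and SciMark Sparse Matrix Multiply results reported in this file.

```python
def ratio_vs_baseline(value, baseline, higher_is_better):
    """Return how many times better `value` is than `baseline` (> 1 means better).

    Hypothetical helper for illustration; not a Phoronix Test Suite function.
    """
    if value <= 0 or baseline <= 0:
        raise ValueError("results must be positive")
    # For "more is better" metrics the ratio is value / baseline;
    # for "fewer is better" metrics it is inverted.
    return value / baseline if higher_is_better else baseline / value


# C-Ray total time in seconds (fewer is better): GCC 9.1.0 vs. LLVM Clang 9.0 SVN.
c_ray = ratio_vs_baseline(5.72, 9.19, higher_is_better=False)
print(f"C-Ray, GCC 9.1.0 relative to Clang 9.0: {c_ray:.2f}x")

# SciMark Sparse Matrix Multiply in Mflops (more is better): Clang 9.0 vs. GCC 9.1.0.
sparse = ratio_vs_baseline(3382.93, 2741.97, higher_is_better=True)
print(f"Sparse Matrix Multiply, Clang 9.0 relative to GCC 9.1.0: {sparse:.2f}x")
```

Normalizing the direction this way keeps a later summary (for example a geometric mean across tests) from mixing up wins and losses.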

CppPerformanceBenchmarks

Test: Random Numbers

CppPerformanceBenchmarks 9, Test: Random Numbers (Seconds, Fewer Is Better)
AOCC 2.0: 1894.02 (SE +/- 0.26, N = 3)
GCC 10.0 Git: 1627.99 (SE +/- 0.12, N = 3)
GCC 9.1.0: 1599.07 (SE +/- 0.06, N = 3)
LLVM Clang 9.0 SVN: 1892.46 (SE +/- 0.02, N = 3)
(CXX) g++ options: -O3 -march=znver2 -std=c++11
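
Each result in this file is reported with a standard error (SE) over N runs. As a rough sketch of what such a figure means, the snippet below computes the standard error of the mean as the sample standard deviation divided by the square root of N. This is the textbook definition and is assumed here; the file does not state exactly how the Phoronix Test Suite derives its SE. The run times are hypothetical and are not taken from this result file.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two runs")
    return statistics.stdev(samples) / math.sqrt(n)


# Hypothetical run times in seconds (illustrative only, not from this file).
runs = [10.0, 12.0, 14.0]
mean = statistics.mean(runs)
se = standard_error(runs)
print(f"{mean:.2f} (SE +/- {se:.2f}, N = {len(runs)})")
```

A small SE relative to the gap between two compilers suggests the difference is unlikely to be run-to-run noise; where the SE is of the same size as the gap, the results are effectively tied.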

High Performance Conjugate Gradient

High Performance Conjugate Gradient 3.0 (GFLOP/s, More Is Better)
AOCC 2.0: 0.33 (SE +/- 0.01, N = 12)
GCC 10.0 Git: 0.36 (SE +/- 0.00, N = 3)
GCC 9.1.0: 0.36 (SE +/- 0.00, N = 3)
LLVM Clang 9.0 SVN: 0.35 (SE +/- 0.00, N = 12)

FFTW

Build: Stock - Size: 2D FFT Size 4096

FFTW 3.3.6, Build: Stock - Size: 2D FFT Size 4096 (Mflops, More Is Better)
AOCC 2.0: 5593.44 (SE +/- 48.86, N = 12)
GCC 10.0 Git: 6289.30 (SE +/- 67.18, N = 3)
GCC 9.1.0: 6327.20 (SE +/- 32.75, N = 3)
LLVM Clang 9.0 SVN: 5842.73 (SE +/- 41.90, N = 3)
(CC) gcc options: -pthread -O3 -march=znver2 -lm

CppPerformanceBenchmarks

Test: Math Library

CppPerformanceBenchmarks 9, Test: Math Library (Seconds, Fewer Is Better)
AOCC 2.0: 330.41 (SE +/- 0.31, N = 3)
GCC 10.0 Git: 333.85 (SE +/- 0.12, N = 3)
GCC 9.1.0: 343.45 (SE +/- 0.46, N = 3)
LLVM Clang 9.0 SVN: 331.02 (SE +/- 0.43, N = 3)
(CXX) g++ options: -O3 -march=znver2 -std=c++11

Apache Benchmark

Static Web Page Serving

Apache Benchmark 2.4.29, Static Web Page Serving (Requests Per Second, More Is Better)
AOCC 2.0: 24163.31 (SE +/- 395.53, N = 15)
GCC 10.0 Git: 24096.60 (SE +/- 497.81, N = 15)
GCC 9.1.0: 24915.19 (SE +/- 435.79, N = 15)
LLVM Clang 9.0 SVN: 24594.66 (SE +/- 400.39, N = 15)
(CC) gcc options: -shared -fPIC -pthread -O3 -march=znver2

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.30, Operation: Resizing (Iterations Per Minute, More Is Better)
AOCC 2.0: 120
GCC 10.0 Git: 102
GCC 9.1.0: 102
LLVM Clang 9.0 SVN: 119
SE values as exported (three values for four results): +/- 1.53, N = 3; +/- 0.58, N = 3; +/- 2.59, N = 12
Additional per-result options, in the order listed above: -lomp, -ldl, -ldl, -lomp
(CC) gcc options: -fopenmp -O3 -march=znver2 -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

CppPerformanceBenchmarks

Test: Stepanov Vector

CppPerformanceBenchmarks 9, Test: Stepanov Vector (Seconds, Fewer Is Better)
AOCC 2.0: 89.73 (SE +/- 0.35, N = 3)
GCC 10.0 Git: 97.70 (SE +/- 0.05, N = 3)
GCC 9.1.0: 99.21 (SE +/- 0.02, N = 3)
LLVM Clang 9.0 SVN: 85.89 (SE +/- 0.11, N = 3)
(CXX) g++ options: -O3 -march=znver2 -std=c++11

CppPerformanceBenchmarks

Test: Atol

CppPerformanceBenchmarks 9, Test: Atol (Seconds, Fewer Is Better)
AOCC 2.0: 74.38 (SE +/- 0.28, N = 3)
GCC 10.0 Git: 74.32 (SE +/- 0.12, N = 3)
GCC 9.1.0: 74.36 (SE +/- 0.23, N = 3)
LLVM Clang 9.0 SVN: 74.10 (SE +/- 0.06, N = 3)
(CXX) g++ options: -O3 -march=znver2 -std=c++11

GraphicsMagick

Operation: Rotate

GraphicsMagick 1.3.30, Operation: Rotate (Iterations Per Minute, More Is Better)
AOCC 2.0: 205
GCC 10.0 Git: 202
GCC 9.1.0: 204
LLVM Clang 9.0 SVN: 237
SE value as exported (one value for four results): +/- 1.15, N = 3
Additional per-result options, in the order listed above: -lomp, -ldl, -ldl, -lomp
(CC) gcc options: -fopenmp -O3 -march=znver2 -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

FFTW

Build: Stock - Size: 2D FFT Size 2048

FFTW 3.3.6, Build: Stock - Size: 2D FFT Size 2048 (Mflops, More Is Better)
AOCC 2.0: 6462.33 (SE +/- 19.37, N = 3)
GCC 10.0 Git: 7288.80 (SE +/- 74.47, N = 3)
GCC 9.1.0: 7381.00 (SE +/- 15.60, N = 3)
LLVM Clang 9.0 SVN: 6627.77 (SE +/- 50.06, N = 3)
(CC) gcc options: -pthread -O3 -march=znver2 -lm

CppPerformanceBenchmarks

Test: Ctype

CppPerformanceBenchmarks 9, Test: Ctype (Seconds, Fewer Is Better)
AOCC 2.0: 37.81 (SE +/- 0.00, N = 3)
GCC 10.0 Git: 39.33 (SE +/- 0.00, N = 3)
GCC 9.1.0: 41.66 (SE +/- 0.12, N = 3)
LLVM Clang 9.0 SVN: 36.74 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -march=znver2 -std=c++11

AOBench

Size: 2048 x 2048 - Total Time

AOBench, Size: 2048 x 2048 - Total Time (Seconds, Fewer Is Better)
AOCC 2.0: 37.02 (SE +/- 0.04, N = 3)
GCC 10.0 Git: 35.33 (SE +/- 0.00, N = 3)
GCC 9.1.0: 36.77 (SE +/- 0.01, N = 3)
LLVM Clang 9.0 SVN: 41.75 (SE +/- 0.01, N = 3)
(CC) gcc options: -lm -O3 -march=znver2

CppPerformanceBenchmarks

Test: Stepanov Abstraction

CppPerformanceBenchmarks 9, Test: Stepanov Abstraction (Seconds, Fewer Is Better)
AOCC 2.0: 34.10 (SE +/- 0.01, N = 3)
GCC 10.0 Git: 36.52 (SE +/- 0.01, N = 3)
GCC 9.1.0: 36.66 (SE +/- 0.01, N = 3)
LLVM Clang 9.0 SVN: 33.44 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -march=znver2 -std=c++11

John The Ripper

Test: Blowfish

John The Ripper 1.9.0-jumbo-1, Test: Blowfish (Real C/S, More Is Better)
AOCC 2.0: 187140 (SE +/- 858.63, N = 3)
GCC 10.0 Git: 148542 (SE +/- 760.59, N = 3)
GCC 9.1.0: 148302 (SE +/- 794.73, N = 3)
LLVM Clang 9.0 SVN: 187652 (SE +/- 2145.04, N = 3)
(CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -pthread -lm -lz -ldl -lcrypt -lbz2

Coremark

CoreMark Size 666 - Iterations Per Second

Coremark 1.0, CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better)
AOCC 2.0: 3284059.92 (SE +/- 40813.96, N = 3)
GCC 10.0 Git: 3825301.45 (SE +/- 29615.82, N = 3)
GCC 9.1.0: 3868113.71 (SE +/- 60538.32, N = 3)
LLVM Clang 9.0 SVN: 3024508.81 (SE +/- 32092.57, N = 3)
(CC) gcc options: -O2 -O3 -march=znver2 -lrt" -lrt

SciMark

Computational Test: Composite

SciMark 2.0, Computational Test: Composite (Mflops, More Is Better)
AOCC 2.0: 2730.13 (SE +/- 4.27, N = 3)
GCC 10.0 Git: 2828.01 (SE +/- 4.68, N = 3)
GCC 9.1.0: 2834.06 (SE +/- 10.94, N = 3)
LLVM Clang 9.0 SVN: 2880.56 (SE +/- 9.28, N = 3)
(CC) gcc options: -O3 -march=znver2 -lm

CppPerformanceBenchmarks

Test: Function Objects

CppPerformanceBenchmarks 9, Test: Function Objects (Seconds, Fewer Is Better)
AOCC 2.0: 18.94 (SE +/- 0.03, N = 3)
GCC 10.0 Git: 19.01 (SE +/- 0.08, N = 3)
GCC 9.1.0: 18.93 (SE +/- 0.02, N = 3)
LLVM Clang 9.0 SVN: 18.82 (SE +/- 0.08, N = 3)
(CXX) g++ options: -O3 -march=znver2 -std=c++11

x265

H.265 1080p Video Encoding

x265 3.0, H.265 1080p Video Encoding (Frames Per Second, More Is Better)
AOCC 2.0: 45.57 (SE +/- 0.15, N = 3)
GCC 10.0 Git: 44.73 (SE +/- 0.21, N = 3)
GCC 9.1.0: 44.54 (SE +/- 0.11, N = 3)
LLVM Clang 9.0 SVN: 45.37 (SE +/- 0.15, N = 3)
(CXX) g++ options: -O3 -march=znver2 -rdynamic -lpthread -lrt -ldl -lnuma

SVT-VP9

1080p 8-bit YUV To VP9 Video Encode

SVT-VP9 2019-02-17, 1080p 8-bit YUV To VP9 Video Encode (Frames Per Second, More Is Better)
GCC 10.0 Git: 283.63 (SE +/- 3.30, N = 15)
GCC 9.1.0: 277.45 (SE +/- 4.48, N = 3)
LLVM Clang 9.0 SVN: 274.55 (SE +/- 3.16, N = 3)
Additional options reported for two of the three results: -fPIE -fPIC -O2 -flto -fvisibility=hidden -mavx
(CC) gcc options: -O3 -march=znver2 -pie -rdynamic -lpthread -lrt -lm

SVT-AV1

1080p 8-bit YUV To AV1 Video Encode

SVT-AV1 0.5, 1080p 8-bit YUV To AV1 Video Encode (Frames Per Second, More Is Better)
GCC 10.0 Git: 100.44 (SE +/- 0.16, N = 3)
GCC 9.1.0: 101.07 (SE +/- 1.25, N = 5)
LLVM Clang 9.0 SVN: 102.78 (SE +/- 0.32, N = 3)
(CXX) g++ options: -O3 -march=znver2 -pie -lpthread -lm

dav1d

Video Input: Summer Nature 4K

dav1d 0.3, Video Input: Summer Nature 4K (Seconds, Fewer Is Better)
AOCC 2.0: 11.14 (SE +/- 0.07, N = 3)
GCC 10.0 Git: 11.16 (SE +/- 0.04, N = 3)
GCC 9.1.0: 11.52 (SE +/- 0.15, N = 3)
LLVM Clang 9.0 SVN: 11.56 (SE +/- 0.19, N = 3)
(CC) gcc options: -O3 -march=znver2 -pthread

C-Ray

Total Time - 4K, 16 Rays Per Pixel

C-Ray 1.1, Total Time - 4K, 16 Rays Per Pixel (Seconds, Fewer Is Better)
AOCC 2.0: 9.01 (SE +/- 0.13, N = 3)
GCC 10.0 Git: 5.82 (SE +/- 0.09, N = 3)
GCC 9.1.0: 5.72 (SE +/- 0.09, N = 3)
LLVM Clang 9.0 SVN: 9.19 (SE +/- 0.04, N = 3)
(CC) gcc options: -lm -lpthread -O3 -march=znver2

libjpeg-turbo tjbench

Test: Decompression Throughput

libjpeg-turbo tjbench 2.0.2, Test: Decompression Throughput (Megapixels/sec, More Is Better)
AOCC 2.0: 175.96 (SE +/- 0.02, N = 3)
GCC 10.0 Git: 175.48 (SE +/- 0.03, N = 3)
GCC 9.1.0: 175.16 (SE +/- 0.02, N = 3)
LLVM Clang 9.0 SVN: 174.13 (SE +/- 0.16, N = 3)
(CC) gcc options: -O3 -march=znver2 -rdynamic

SVT-HEVC

1080p 8-bit YUV To HEVC Video Encode

SVT-HEVC 2019-02-03, 1080p 8-bit YUV To HEVC Video Encode (Frames Per Second, More Is Better)
GCC 10.0 Git: 344.04 (SE +/- 3.15, N = 10)
GCC 9.1.0: 343.29 (SE +/- 4.37, N = 3)
LLVM Clang 9.0 SVN: 337.75 (SE +/- 4.13, N = 3)
Additional options reported for two of the three results: -fPIE -fPIC -O2 -flto -fvisibility=hidden -march=native
(CC) gcc options: -O3 -march=znver2 -pie -rdynamic -lpthread -lrt

dav1d

Video Input: Summer Nature 1080p

dav1d 0.3, Video Input: Summer Nature 1080p (Seconds, Fewer Is Better)
AOCC 2.0: 4.77 (SE +/- 0.04, N = 3)
GCC 10.0 Git: 4.77 (SE +/- 0.03, N = 3)
GCC 9.1.0: 4.75 (SE +/- 0.04, N = 3)
LLVM Clang 9.0 SVN: 4.86 (SE +/- 0.00, N = 3)
(CC) gcc options: -O3 -march=znver2 -pthread

x264

H.264 Video Encoding

x264 2018-09-25, H.264 Video Encoding (Frames Per Second, More Is Better)
AOCC 2.0: 157.18 (SE +/- 2.65, N = 3)
GCC 10.0 Git: 152.43 (SE +/- 0.41, N = 3)
GCC 9.1.0: 153.31 (SE +/- 0.91, N = 3)
LLVM Clang 9.0 SVN: 153.95 (SE +/- 0.48, N = 3)
Additional option reported for two of the four results: -mstack-alignment=64
(CC) gcc options: -ldl -lavformat -lavcodec -lavutil -lswscale -m64 -lm -lpthread -O3 -ffast-math -march=znver2 -std=gnu99 -fPIC -fomit-frame-pointer -fno-tree-vectorize

TSCP

AI Chess Performance

TSCP 1.81, AI Chess Performance (Nodes Per Second, More Is Better)
AOCC 2.0: 1093682 (SE +/- 532.27, N = 5)
GCC 10.0 Git: 1040775 (SE +/- 393.00, N = 5)
GCC 9.1.0: 1072812 (SE +/- 1412.87, N = 5)
LLVM Clang 9.0 SVN: 1149370 (SE +/- 479.00, N = 5)
(CC) gcc options: -O3 -march=znver2 -march=native

SciMark

Computational Test: Jacobi Successive Over-Relaxation

SciMark 2.0, Computational Test: Jacobi Successive Over-Relaxation (Mflops, More Is Better)
AOCC 2.0: 1656.18 (SE +/- 0.43, N = 3)
GCC 10.0 Git: 1800.11 (SE +/- 0.14, N = 3)
GCC 9.1.0: 1801.31 (SE +/- 0.11, N = 3)
LLVM Clang 9.0 SVN: 1655.77 (SE +/- 0.16, N = 3)
(CC) gcc options: -O3 -march=znver2 -lm

SciMark

Computational Test: Dense LU Matrix Factorization

SciMark 2.0, Computational Test: Dense LU Matrix Factorization (Mflops, More Is Better)
AOCC 2.0: 8473.11 (SE +/- 10.49, N = 3)
GCC 10.0 Git: 8762.75 (SE +/- 28.43, N = 3)
GCC 9.1.0: 8811.17 (SE +/- 15.96, N = 3)
LLVM Clang 9.0 SVN: 8518.52 (SE +/- 6.82, N = 3)
(CC) gcc options: -O3 -march=znver2 -lm

SciMark

Computational Test: Sparse Matrix Multiply

SciMark 2.0, Computational Test: Sparse Matrix Multiply (Mflops, More Is Better)
AOCC 2.0: 2736.94 (SE +/- 14.78, N = 3)
GCC 10.0 Git: 2763.50 (SE +/- 15.27, N = 3)
GCC 9.1.0: 2741.97 (SE +/- 69.70, N = 3)
LLVM Clang 9.0 SVN: 3382.93 (SE +/- 47.14, N = 3)
(CC) gcc options: -O3 -march=znver2 -lm

SciMark

Computational Test: Fast Fourier Transform

SciMark 2.0, Computational Test: Fast Fourier Transform (Mflops, More Is Better)
AOCC 2.0: 178.57 (SE +/- 0.38, N = 3)
GCC 10.0 Git: 202.71 (SE +/- 0.44, N = 3)
GCC 9.1.0: 203.68 (SE +/- 0.91, N = 3)
LLVM Clang 9.0 SVN: 224.52 (SE +/- 0.10, N = 3)
(CC) gcc options: -O3 -march=znver2 -lm

SciMark

Computational Test: Monte Carlo

SciMark 2.0, Computational Test: Monte Carlo (Mflops, More Is Better)
AOCC 2.0: 605.88 (SE +/- 0.12, N = 3)
GCC 10.0 Git: 610.98 (SE +/- 0.05, N = 3)
GCC 9.1.0: 612.17 (SE +/- 0.53, N = 3)
LLVM Clang 9.0 SVN: 621.05 (SE +/- 0.10, N = 3)
(CC) gcc options: -O3 -march=znver2 -lm


Phoronix Test Suite v10.8.4