GCC 8.0 vs. Clang 6.0 AMD EPYC Tuning Comparison

Tests for a future article on Phoronix.

HTML result view exported from: https://openbenchmarking.org/result/1801025-AL-CLANGCC0290&obr_imw=y&gru&rdt.

System Details

Configurations: Clang 6.0 znver1, Clang 6.0 x86-64, GCC 8.0 znver1, GCC 8.0 x86-64

Processor:          AMD EPYC 7601 32-Core @ 2.20GHz (64 Cores)
Motherboard:        TYAN B8026T70AE24HR
Chipset:            AMD Device 1450
Memory:             126976MB
Disk:               280GB INTEL SSDPE21D280GA
Graphics:           ASPEED ASPEED Family
Monitor:            VE228
Network:            Broadcom Limited NetXtreme BCM5720 Gigabit PCIe
OS:                 Ubuntu 17.10
Kernel:             4.13.0-21-generic (x86_64)
Desktop:            GNOME Shell 3.26.1
Display Driver:     modesetting 1.19.5
OpenCL:             OpenCL 1.2 pocl 1.0 LLVM 5.0.0
Compiler (Clang 6.0 runs): Clang 6.0.0 (SVN 321623) + LLVM 6.0.0svn
Compiler (GCC 8.0 runs):   GCC 8.0.0 20171231 + clang (GCC) 8.0.0 20171231 (experimental) + LLVM 5.0.0
File-System:        ext4
Screen Resolution:  1920x1080
System Layer:       vm-other Xen 4.9.0 Hypervisor

Compiler Details
- Clang 6.0: znver1: Optimized build; Default target: x86_64-unknown-linux-gnu; Host CPU: znver1
- Clang 6.0: x86-64: Optimized build; Default target: x86_64-unknown-linux-gnu; Host CPU: znver1
- GCC 8.0: znver1: --disable-multilib --enable-checking=release
- GCC 8.0: x86-64: --disable-multilib --enable-checking=release

Disk Details
- NONE / data=ordered,errors=remount-ro,relatime,rw

Result Overview

Test                                          Clang 6.0   Clang 6.0   GCC 8.0     GCC 8.0
                                              znver1      x86-64      znver1      x86-64
graphics-magick: Blur                         104         101         123         116
graphics-magick: Sharpen                      136         131         165         157
graphics-magick: HWB Color Space              155         150         186         177
graphics-magick: Local Adaptive Thresholding  98          97          95          92
fftw: Stock - 2D FFT Size 4096                5031.60     4660.83     5627.83     4959.73
fftw: Float + SSE - 2D FFT Size 4096          12481       13649       13630       -
scimark2: Composite                           1699.32     1479.53     1680.45     1579.48
scimark2: Monte Carlo                         552.19      531.38      555.76      561.03
scimark2: Fast Fourier Transform              226.68      179.29      231.09      233.89
scimark2: Sparse Matrix Multiply              2258.64     2190.10     2259.95     2263.87
scimark2: Dense LU Matrix Factorization       4034.89     3190.43     3678.86     3513.11
scimark2: Jacobi Successive Over-Relaxation   1424.21     1110.65     1676.62     1423.14
himeno: Poisson Pressure Solver               1052.47     1032.71     935.64      949.19
tscp: AI Chess Performance                    918269      917658      875085      874251
ebizzy                                        1145405     1076648     1101176     1126032
apache: Static Web Page Serving               9663.93     9531.43     9791.23     9841.30
sqlite: Default Test Directory                7.48        7.53        7.16        7.61
polybench-c: 3 Matrix Multiplications         62.75       62.98       65.45       60.68
hmmer: Pfam Database Search                   11.09       12.85       12.40       13.65
c-ray: Total Time                             4.48        4.53        3.37        3.93
bullet: Raytests                              3.18        3.22        3.06        3.12
bullet: 3000 Fall                             5.34        5.48        5.27        5.34
bullet: 1000 Stack                            6.08        6.30        5.93        6.18
bullet: 1000 Convex                           5.31        5.43        5.28        5.44
bullet: 136 Ragdolls                          3.23        3.28        3.19        3.26
bullet: Prim Trimesh                          1.09        1.10        1.10        1.10
bullet: Convex Trimesh                        1.32        1.33        1.30        1.34
encode-flac: WAV To FLAC                      6.63        7.94        7.45        7.12
encode-mp3: WAV To MP3                        12.81       11.33       10.81       11.10
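
One way to read the overview is as the relative gain of the znver1 build over the generic x86-64 build for each compiler. The following is a minimal sketch, not part of the original export: the function name `tuning_gain` is hypothetical, the three sample rows are copied from the results above, and the "more is better" flags follow the units given in the per-test sections (Iterations Per Minute and MFLOPS are higher-is-better, Seconds is lower-is-better).

```python
def tuning_gain(znver1: float, generic: float, more_is_better: bool) -> float:
    """Percent improvement of the znver1 build over the x86-64 build.

    A positive value means the znver1 build did better.
    """
    if more_is_better:
        return (znver1 / generic - 1.0) * 100.0
    # For time-based results the smaller number is the better one.
    return (generic / znver1 - 1.0) * 100.0


# (test, more_is_better, Clang znver1, Clang x86-64, GCC znver1, GCC x86-64)
# Values copied from the result overview; only three rows are shown.
samples = [
    ("graphics-magick: Blur", True, 104, 101, 123, 116),
    ("himeno: Poisson Pressure Solver", True, 1052.47, 1032.71, 935.64, 949.19),
    ("c-ray: Total Time", False, 4.48, 4.53, 3.37, 3.93),
]

for name, better, clang_z, clang_g, gcc_z, gcc_g in samples:
    print(
        f"{name}: Clang {tuning_gain(clang_z, clang_g, better):+.1f}%, "
        f"GCC {tuning_gain(gcc_z, gcc_g, better):+.1f}%"
    )
```

Flipping the ratio for time-based tests keeps the sign convention uniform, so a positive percentage always favors the znver1 build regardless of the unit.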

GraphicsMagick

Operation: Blur

GraphicsMagick 1.3.19 (Iterations Per Minute, More Is Better)

Clang 6.0 znver1:  104
GCC 8.0 znver1:    123
Clang 6.0 x86-64:  101
GCC 8.0 x86-64:    116

Standard errors listed: SE +/- 0.33, N = 3; SE +/- 0.33, N = 3 (the export lists only two and does not say which results they belong to)

(CC) gcc options: -fopenmp -O3 -march=x86-64 -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.19 (Iterations Per Minute, More Is Better)

Clang 6.0 znver1:  136
GCC 8.0 znver1:    165
Clang 6.0 x86-64:  131
GCC 8.0 x86-64:    157

Standard errors listed: SE +/- 0.67, N = 3; SE +/- 0.33, N = 3 (the export lists only two and does not say which results they belong to)

(CC) gcc options: -fopenmp -O3 -march=x86-64 -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

GraphicsMagick

Operation: HWB Color Space

GraphicsMagick 1.3.19 (Iterations Per Minute, More Is Better)

Clang 6.0 znver1:  155
GCC 8.0 znver1:    186
Clang 6.0 x86-64:  150
GCC 8.0 x86-64:    177

Standard errors listed: SE +/- 0.88, N = 3; SE +/- 1.20, N = 3 (the export lists only two and does not say which results they belong to)

(CC) gcc options: -fopenmp -O3 -march=x86-64 -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

GraphicsMagick

Operation: Local Adaptive Thresholding

GraphicsMagick 1.3.19 (Iterations Per Minute, More Is Better)

Clang 6.0 znver1:  98
GCC 8.0 znver1:    95
Clang 6.0 x86-64:  97
GCC 8.0 x86-64:    92

Standard errors listed: SE +/- 0.33, N = 3; SE +/- 0.33, N = 3 (the export lists only two and does not say which results they belong to)

(CC) gcc options: -fopenmp -O3 -march=x86-64 -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

FFTW

Build: Stock - Size: 2D FFT Size 4096

FFTW 3.3.6 (Mflops, More Is Better)

Clang 6.0 znver1:  5031.60  (SE +/- 17.93, N = 3)
GCC 8.0 znver1:    5627.83  (SE +/- 10.37, N = 3)
Clang 6.0 x86-64:  4660.83  (SE +/- 57.25, N = 3)
GCC 8.0 x86-64:    4959.73  (SE +/- 1.16, N = 3)

(CC) gcc options: -pthread -O3 -lm

FFTW

Build: Float + SSE - Size: 2D FFT Size 4096

FFTW 3.3.6 (Mflops, More Is Better)

Clang 6.0 znver1:  12481  (SE +/- 76.70, N = 3)
GCC 8.0 znver1:    13630  (SE +/- 15.38, N = 3)
Clang 6.0 x86-64:  13649  (SE +/- 100.95, N = 3)
GCC 8.0 x86-64:    no result in the export

(CC) gcc options: -pthread -O3 -lm

SciMark

Computational Test: Composite

SciMark 2.0 (Mflops, More Is Better)

Clang 6.0 znver1:  1699.32  (SE +/- 25.95, N = 4)
GCC 8.0 znver1:    1680.45  (SE +/- 20.36, N = 4)
Clang 6.0 x86-64:  1479.53  (SE +/- 36.18, N = 8)
GCC 8.0 x86-64:    1579.48  (SE +/- 20.51, N = 7)

(CC) gcc options: -O3 -march=x86-64 -lm

SciMark

Computational Test: Monte Carlo

SciMark 2.0 (Mflops, More Is Better)

Clang 6.0 znver1:  552.19  (SE +/- 0.05, N = 4)
GCC 8.0 znver1:    555.76  (SE +/- 0.01, N = 4)
Clang 6.0 x86-64:  531.38  (SE +/- 0.05, N = 4)
GCC 8.0 x86-64:    561.03  (SE +/- 0.04, N = 4)

(CC) gcc options: -O3 -march=x86-64 -lm

SciMark

Computational Test: Fast Fourier Transform

SciMark 2.0 (Mflops, More Is Better)

Clang 6.0 znver1:  226.68  (SE +/- 0.38, N = 4)
GCC 8.0 znver1:    231.09  (SE +/- 0.14, N = 4)
Clang 6.0 x86-64:  179.29  (SE +/- 45.12, N = 4)
GCC 8.0 x86-64:    233.89  (SE +/- 0.35, N = 4)

(CC) gcc options: -O3 -march=x86-64 -lm

SciMark

Computational Test: Sparse Matrix Multiply

SciMark 2.0 (Mflops, More Is Better)

Clang 6.0 znver1:  2258.64  (SE +/- 29.07, N = 4)
GCC 8.0 znver1:    2259.95  (SE +/- 9.55, N = 4)
Clang 6.0 x86-64:  2190.10  (SE +/- 11.89, N = 4)
GCC 8.0 x86-64:    2263.87  (SE +/- 6.72, N = 4)

(CC) gcc options: -O3 -march=x86-64 -lm

SciMark

Computational Test: Dense LU Matrix Factorization

SciMark 2.0 (Mflops, More Is Better)

Clang 6.0 znver1:  4034.89  (SE +/- 135.63, N = 4)
GCC 8.0 znver1:    3678.86  (SE +/- 110.38, N = 4)
Clang 6.0 x86-64:  3190.43  (SE +/- 17.92, N = 4)
GCC 8.0 x86-64:    3513.11  (SE +/- 177.13, N = 4)

(CC) gcc options: -O3 -march=x86-64 -lm

SciMark

Computational Test: Jacobi Successive Over-Relaxation

SciMark 2.0 (Mflops, More Is Better)

Clang 6.0 znver1:  1424.21  (SE +/- 0.49, N = 4)
GCC 8.0 znver1:    1676.62  (SE +/- 0.77, N = 4)
Clang 6.0 x86-64:  1110.65  (SE +/- 310.16, N = 4)
GCC 8.0 x86-64:    1423.14  (SE +/- 0.28, N = 4)

(CC) gcc options: -O3 -march=x86-64 -lm
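
The Jacobi Successive Over-Relaxation results above include one standard error (SE +/- 310.16) that is far larger than the other three. A small sketch of how such a run could be flagged by expressing each standard error as a percentage of its mean follows. It assumes the export lists the standard errors in the same order as the results; the helper name `relative_se` and the 5% threshold are illustrative choices, not something the export defines.

```python
def relative_se(mean: float, se: float) -> float:
    """Standard error expressed as a percentage of the reported mean."""
    return se / mean * 100.0


# Jacobi Successive Over-Relaxation, in the order the export lists them.
# Assumption: the i-th standard error belongs to the i-th result.
results = [1424.21, 1676.62, 1110.65, 1423.14]
errors = [0.49, 0.77, 310.16, 0.28]

NOISY_PERCENT = 5.0  # arbitrary illustrative cut-off

for mean, se in zip(results, errors):
    pct = relative_se(mean, se)
    note = "  <- high run-to-run variance" if pct > NOISY_PERCENT else ""
    print(f"{mean:8.2f} Mflops  SE {pct:6.2f}% of mean{note}")
```

Under that ordering assumption, only the 1110.65 Mflops result exceeds the threshold, which suggests treating that particular number with caution when comparing configurations.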

Himeno Benchmark

Poisson Pressure Solver

Himeno Benchmark 3.0 (MFLOPS, More Is Better)

Clang 6.0 znver1:  1052.47  (SE +/- 1.50, N = 3)
GCC 8.0 znver1:    935.64   (SE +/- 0.79, N = 3)
Clang 6.0 x86-64:  1032.71  (SE +/- 1.46, N = 3)
GCC 8.0 x86-64:    949.19   (SE +/- 0.43, N = 3)

(CC) gcc options: -O3 -march=x86-64 -mavx2

TSCP

AI Chess Performance

TSCP 1.81 (Nodes Per Second, More Is Better)

Clang 6.0 znver1:  918269  (SE +/- 306.40, N = 5)
GCC 8.0 znver1:    875085  (SE +/- 556.88, N = 5)
Clang 6.0 x86-64:  917658  (SE +/- 572.37, N = 5)
GCC 8.0 x86-64:    874251  (SE +/- 438.77, N = 5)

(CC) gcc options: -O3 -march=x86-64 -march=native

ebizzy

ebizzy 0.3 (Records/s, More Is Better)

Clang 6.0 znver1:  1145405  (SE +/- 17141.58, N = 6)
GCC 8.0 znver1:    1101176  (SE +/- 19461.21, N = 3)
Clang 6.0 x86-64:  1076648  (SE +/- 12350.97, N = 3)
GCC 8.0 x86-64:    1126032  (SE +/- 20747.16, N = 6)

(CC) gcc options: -pthread -lpthread -O3 -march=x86-64 -march=native

Apache Benchmark

Static Web Page Serving

Apache Benchmark 2.4.7 (Requests Per Second, More Is Better)

Clang 6.0 znver1:  9663.93  (SE +/- 121.20, N = 3)
GCC 8.0 znver1:    9791.23  (SE +/- 35.54, N = 3)
Clang 6.0 x86-64:  9531.43  (SE +/- 161.49, N = 3)
GCC 8.0 x86-64:    9841.30  (SE +/- 22.88, N = 3)

(CC) gcc options: -shared -fPIC -pthread -O3 -march=x86-64

SQLite

Test Target: Default Test Directory

SQLite 3.8.10.2 (Seconds, Fewer Is Better)

Clang 6.0 znver1:  7.48  (SE +/- 0.15, N = 6)
GCC 8.0 znver1:    7.16  (SE +/- 0.12, N = 6)
Clang 6.0 x86-64:  7.53  (SE +/- 0.06, N = 3)
GCC 8.0 x86-64:    7.61  (SE +/- 0.12, N = 3)

(CC) gcc options: -O3 -march=x86-64 -ldl -lpthread

PolyBench-C

Test: 3 Matrix Multiplications

PolyBench-C 3.2 (Seconds, Fewer Is Better)

Clang 6.0 znver1:  62.75  (SE +/- 0.26, N = 3)
GCC 8.0 znver1:    65.45  (SE +/- 0.21, N = 3)
Clang 6.0 x86-64:  62.98  (SE +/- 0.24, N = 3)
GCC 8.0 x86-64:    60.68  (SE +/- 0.03, N = 3)

(CC) gcc options: -O3 -march=x86-64 -march=native

Timed HMMer Search

Pfam Database Search

Timed HMMer Search 2.3.2 (Seconds, Fewer Is Better)

Clang 6.0 znver1:  11.09  (SE +/- 0.12, N = 3)
GCC 8.0 znver1:    12.40  (SE +/- 0.04, N = 3)
Clang 6.0 x86-64:  12.85  (SE +/- 1.28, N = 6)
GCC 8.0 x86-64:    13.65  (SE +/- 1.74, N = 6)

(CC) gcc options: -O3 -march=x86-64 -pthread -lhmmer -lsquid -lm

C-Ray

Total Time

C-Ray 1.1 (Seconds, Fewer Is Better)

Clang 6.0 znver1:  4.48  (SE +/- 0.01, N = 3)
GCC 8.0 znver1:    3.37  (SE +/- 0.02, N = 3)
Clang 6.0 x86-64:  4.53  (SE +/- 0.02, N = 3)
GCC 8.0 x86-64:    3.93  (SE +/- 0.00, N = 3)

(CC) gcc options: -lm -lpthread -O3 -march=x86-64

Bullet Physics Engine

Test: Raytests

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)

Clang 6.0 znver1:  3.18  (SE +/- 0.00, N = 3)
GCC 8.0 znver1:    3.06  (SE +/- 0.01, N = 3)
Clang 6.0 x86-64:  3.22  (SE +/- 0.00, N = 3)
GCC 8.0 x86-64:    3.12  (SE +/- 0.00, N = 3)

(CXX) g++ options: -O3 -march=x86-64 -rdynamic

Bullet Physics Engine

Test: 3000 Fall

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)

Clang 6.0 znver1:  5.34  (SE +/- 0.01, N = 3)
GCC 8.0 znver1:    5.27  (SE +/- 0.01, N = 3)
Clang 6.0 x86-64:  5.48  (SE +/- 0.01, N = 3)
GCC 8.0 x86-64:    5.34  (SE +/- 0.01, N = 3)

(CXX) g++ options: -O3 -march=x86-64 -rdynamic

Bullet Physics Engine

Test: 1000 Stack

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)

Clang 6.0 znver1:  6.08  (SE +/- 0.01, N = 3)
GCC 8.0 znver1:    5.93  (SE +/- 0.01, N = 3)
Clang 6.0 x86-64:  6.30  (SE +/- 0.03, N = 3)
GCC 8.0 x86-64:    6.18  (SE +/- 0.02, N = 3)

(CXX) g++ options: -O3 -march=x86-64 -rdynamic

Bullet Physics Engine

Test: 1000 Convex

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)

Clang 6.0 znver1:  5.31  (SE +/- 0.01, N = 3)
GCC 8.0 znver1:    5.28  (SE +/- 0.01, N = 3)
Clang 6.0 x86-64:  5.43  (SE +/- 0.01, N = 3)
GCC 8.0 x86-64:    5.44  (SE +/- 0.00, N = 3)

(CXX) g++ options: -O3 -march=x86-64 -rdynamic

Bullet Physics Engine

Test: 136 Ragdolls

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)

Clang 6.0 znver1:  3.23  (SE +/- 0.00, N = 3)
GCC 8.0 znver1:    3.19  (SE +/- 0.01, N = 3)
Clang 6.0 x86-64:  3.28  (SE +/- 0.00, N = 3)
GCC 8.0 x86-64:    3.26  (SE +/- 0.00, N = 3)

(CXX) g++ options: -O3 -march=x86-64 -rdynamic

Bullet Physics Engine

Test: Prim Trimesh

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)

Clang 6.0 znver1:  1.09  (SE +/- 0.00, N = 3)
GCC 8.0 znver1:    1.10  (SE +/- 0.00, N = 3)
Clang 6.0 x86-64:  1.10  (SE +/- 0.00, N = 3)
GCC 8.0 x86-64:    1.10  (SE +/- 0.00, N = 3)

(CXX) g++ options: -O3 -march=x86-64 -rdynamic

Bullet Physics Engine

Test: Convex Trimesh

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)

Clang 6.0 znver1:  1.32  (SE +/- 0.00, N = 3)
GCC 8.0 znver1:    1.30  (SE +/- 0.00, N = 3)
Clang 6.0 x86-64:  1.33  (SE +/- 0.00, N = 3)
GCC 8.0 x86-64:    1.34  (SE +/- 0.00, N = 3)

(CXX) g++ options: -O3 -march=x86-64 -rdynamic

FLAC Audio Encoding

WAV To FLAC

FLAC Audio Encoding 1.3.1 (Seconds, Fewer Is Better)

Clang 6.0 znver1:  6.63  (SE +/- 0.00, N = 5)
GCC 8.0 znver1:    7.45  (SE +/- 0.01, N = 5)
Clang 6.0 x86-64:  7.94  (SE +/- 0.01, N = 5)
GCC 8.0 x86-64:    7.12  (SE +/- 0.01, N = 5)

(CXX) g++ options: -O3 -march=x86-64 -logg -lm

LAME MP3 Encoding

WAV To MP3

LAME MP3 Encoding 3.99.5 (Seconds, Fewer Is Better)

Clang 6.0 znver1:  12.81  (SE +/- 0.01, N = 5)
GCC 8.0 znver1:    10.81  (SE +/- 0.01, N = 5)
Clang 6.0 x86-64:  11.33  (SE +/- 0.00, N = 5)
GCC 8.0 x86-64:    11.10  (SE +/- 0.02, N = 5)

(CC) gcc options: -O3 -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -pipe -march=x86-64 -lm


Phoronix Test Suite v10.8.4