GCC 8.0 vs. Clang 6.0 AMD EPYC Tuning Comparison

Tests for a future article on Phoronix.

HTML result view exported from: https://openbenchmarking.org/result/1801025-AL-CLANGCC0290&sor&grs.

System Details (GCC 8.0 and Clang 6.0, each in an x86-64 and a znver1 configuration)

Processor: AMD EPYC 7601 32-Core @ 2.20GHz (64 Cores)
Motherboard: TYAN B8026T70AE24HR
Chipset: AMD Device 1450
Memory: 126976MB
Disk: 280GB INTEL SSDPE21D280GA
Graphics: ASPEED ASPEED Family
Monitor: VE228
Network: Broadcom Limited NetXtreme BCM5720 Gigabit PCIe
OS: Ubuntu 17.10
Kernel: 4.13.0-21-generic (x86_64)
Desktop: GNOME Shell 3.26.1
Display Driver: modesetting 1.19.5
OpenCL: OpenCL 1.2 pocl 1.0 LLVM 5.0.0
Compiler (GCC 8.0 runs): GCC 8.0.0 20171231 + clang (GCC) 8.0.0 20171231 (experimental) + LLVM 5.0.0
Compiler (Clang 6.0 runs): Clang 6.0.0 (SVN 321623) + LLVM 6.0.0svn
File-System: ext4
Screen Resolution: 1920x1080
System Layer: vm-other Xen 4.9.0 Hypervisor

Compiler Details
- GCC 8.0: x86-64: --disable-multilib --enable-checking=release
- GCC 8.0: znver1: --disable-multilib --enable-checking=release
- Clang 6.0: x86-64: Optimized build; Default target: x86_64-unknown-linux-gnu; Host CPU: znver1
- Clang 6.0: znver1: Optimized build; Default target: x86_64-unknown-linux-gnu; Host CPU: znver1

Disk Details
- NONE / data=ordered,errors=remount-ro,relatime,rw

Result overview (a dash marks a configuration with no result in the export):

Test                                          GCC x86-64    GCC znver1    Clang x86-64  Clang znver1
c-ray: Total Time                             3.93          3.37          4.53          4.48
graphics-magick: Sharpen                      157           165           131           136
graphics-magick: HWB Color Space              177           186           150           155
graphics-magick: Blur                         116           123           101           104
fftw: Stock - 2D FFT Size 4096                4959.73       5627.83       4660.83       5031.60
encode-flac: WAV To FLAC                      7.12          7.45          7.94          6.63
encode-mp3: WAV To MP3                        11.10         10.81         11.33         12.81
himeno: Poisson Pressure Solver               949.19        935.64        1032.71       1052.47
fftw: Float + SSE - 2D FFT Size 4096          -             13630         13649         12481
polybench-c: 3 Matrix Multiplications         60.68         65.45         62.98         62.75
graphics-magick: Local Adaptive Thresholding  92            95            97            98
ebizzy                                        1126032       1101176       1076648       1145405
sqlite: Default Test Directory                7.61          7.16          7.53          7.48
bullet: 1000 Stack                            6.18          5.93          6.30          6.08
scimark2: Monte Carlo                         561.03        555.76        531.38        552.19
bullet: Raytests                              3.12          3.06          3.22          3.18
tscp: AI Chess Performance                    874251        875085        917658        918269
bullet: 3000 Fall                             5.34          5.27          5.48          5.34
scimark2: Sparse Matrix Multiply              2263.87       2259.95       2190.10       2258.64
apache: Static Web Page Serving               9841.30       9791.23       9531.43       9663.93
bullet: Convex Trimesh                        1.34          1.30          1.33          1.32
bullet: 1000 Convex                           5.44          5.28          5.43          5.31
bullet: 136 Ragdolls                          3.26          3.19          3.28          3.23
bullet: Prim Trimesh                          1.10          1.10          1.10          1.09
scimark2: Jacobi Successive Over-Relaxation   1423.14       1676.62       1110.65       1424.21
scimark2: Dense LU Matrix Factorization       3513.11       3678.86       3190.43       4034.89
scimark2: Fast Fourier Transform              233.89        231.09        179.29        226.68
scimark2: Composite                           1579.48       1680.45       1479.53       1699.32
hmmer: Pfam Database Search                   13.65         12.40         12.85         11.09
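Since this result file compares generic x86-64 builds against znver1-tuned builds, one way to read the overview is as a per-compiler percentage gain from tuning. The following is a minimal sketch, not part of the exported result. It assumes the overview columns run GCC x86-64, GCC znver1, Clang x86-64, Clang znver1. The names `RESULTS`, `tuning_gain_percent`, and `summarize` are hypothetical, and only three tests are copied in for illustration.

```python
# Sketch: percent gain of the znver1 build over the generic x86-64 build.
# Values are copied from the result overview; the (x86-64, znver1) pairing
# assumes the column order GCC x86-64, GCC znver1, Clang x86-64, Clang znver1.

# test name -> (higher_is_better, {compiler: (x86-64 result, znver1 result)})
RESULTS = {
    "c-ray: Total Time": (
        False, {"GCC 8.0": (3.93, 3.37), "Clang 6.0": (4.53, 4.48)}),
    "fftw: Stock - 2D FFT Size 4096": (
        True, {"GCC 8.0": (4959.73, 5627.83), "Clang 6.0": (4660.83, 5031.60)}),
    "encode-mp3: WAV To MP3": (
        False, {"GCC 8.0": (11.10, 10.81), "Clang 6.0": (11.33, 12.81)}),
}


def tuning_gain_percent(generic, tuned, higher_is_better):
    """Return how much better the tuned result is, in percent.

    Positive means the znver1 build did better, negative means it did worse.
    """
    ratio = tuned / generic if higher_is_better else generic / tuned
    return (ratio - 1.0) * 100.0


def summarize(results):
    """Yield (test, compiler, gain) tuples for every entry in results."""
    for test, (higher_is_better, by_compiler) in results.items():
        for compiler, (generic, tuned) in by_compiler.items():
            yield test, compiler, tuning_gain_percent(
                generic, tuned, higher_is_better)


if __name__ == "__main__":
    for test, compiler, gain in summarize(RESULTS):
        print(f"{test:32s} {compiler:10s} {gain:+6.1f}%")
```

For timed tests the ratio is inverted so that a positive number always means the znver1 build did better. Gains smaller than the reported standard errors should be read as noise rather than as a tuning effect.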

C-Ray

Total Time

C-Ray 1.1, Total Time (Seconds, Fewer Is Better)
GCC 8.0 znver1: 3.37; GCC 8.0 x86-64: 3.93; Clang 6.0 znver1: 4.48; Clang 6.0 x86-64: 4.53
Standard errors as listed in the chart: SE +/- 0.02, N = 3; SE +/- 0.00, N = 3; SE +/- 0.01, N = 3; SE +/- 0.02, N = 3
(CC) gcc options: -lm -lpthread -O3

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.19, Operation: Sharpen (Iterations Per Minute, More Is Better)
GCC 8.0 znver1: 165; GCC 8.0 x86-64: 157; Clang 6.0 znver1: 136; Clang 6.0 x86-64: 131
Standard errors as listed in the chart: SE +/- 0.33, N = 3; SE +/- 0.67, N = 3
(CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lgomp -lpthread

GraphicsMagick

Operation: HWB Color Space

GraphicsMagick 1.3.19, Operation: HWB Color Space (Iterations Per Minute, More Is Better)
GCC 8.0 znver1: 186; GCC 8.0 x86-64: 177; Clang 6.0 znver1: 155; Clang 6.0 x86-64: 150
Standard errors as listed in the chart: SE +/- 0.88, N = 3; SE +/- 1.20, N = 3
(CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lgomp -lpthread

GraphicsMagick

Operation: Blur

GraphicsMagick 1.3.19, Operation: Blur (Iterations Per Minute, More Is Better)
GCC 8.0 znver1: 123; GCC 8.0 x86-64: 116; Clang 6.0 znver1: 104; Clang 6.0 x86-64: 101
Standard errors as listed in the chart: SE +/- 0.33, N = 3; SE +/- 0.33, N = 3
(CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lgomp -lpthread

FFTW

Build: Stock - Size: 2D FFT Size 4096

FFTW 3.3.6, Build: Stock - Size: 2D FFT Size 4096 (Mflops, More Is Better)
GCC 8.0 znver1: 5627.83; GCC 8.0 x86-64: 4959.73; Clang 6.0 znver1: 5031.60; Clang 6.0 x86-64: 4660.83
Standard errors as listed in the chart: SE +/- 10.37, N = 3; SE +/- 1.16, N = 3; SE +/- 17.93, N = 3; SE +/- 57.25, N = 3
(CC) gcc options: -pthread -O3 -lm

FLAC Audio Encoding

WAV To FLAC

FLAC Audio Encoding 1.3.1, WAV To FLAC (Seconds, Fewer Is Better)
Clang 6.0 znver1: 6.63; Clang 6.0 x86-64: 7.94; GCC 8.0 x86-64: 7.12; GCC 8.0 znver1: 7.45
Standard errors as listed in the chart: SE +/- 0.00, N = 5; SE +/- 0.01, N = 5; SE +/- 0.01, N = 5; SE +/- 0.01, N = 5
(CXX) g++ options: -O3 -logg -lm

LAME MP3 Encoding

WAV To MP3

LAME MP3 Encoding 3.99.5, WAV To MP3 (Seconds, Fewer Is Better)
GCC 8.0 znver1: 10.81; GCC 8.0 x86-64: 11.10; Clang 6.0 x86-64: 11.33; Clang 6.0 znver1: 12.81
Standard errors as listed in the chart: SE +/- 0.01, N = 5; SE +/- 0.02, N = 5; SE +/- 0.00, N = 5; SE +/- 0.01, N = 5
(CC) gcc options: -O3 -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -pipe -lm

Himeno Benchmark

Poisson Pressure Solver

Himeno Benchmark 3.0, Poisson Pressure Solver (MFLOPS, More Is Better)
Clang 6.0 znver1: 1052.47; Clang 6.0 x86-64: 1032.71; GCC 8.0 x86-64: 949.19; GCC 8.0 znver1: 935.64
Standard errors as listed in the chart: SE +/- 1.50, N = 3; SE +/- 1.46, N = 3; SE +/- 0.43, N = 3; SE +/- 0.79, N = 3
(CC) gcc options: -O3 -mavx2

FFTW

Build: Float + SSE - Size: 2D FFT Size 4096

FFTW 3.3.6, Build: Float + SSE - Size: 2D FFT Size 4096 (Mflops, More Is Better)
Clang 6.0 x86-64: 13649; Clang 6.0 znver1: 12481; GCC 8.0 znver1: 13630; GCC 8.0 x86-64: no result in the export
Standard errors as listed in the chart: SE +/- 100.95, N = 3; SE +/- 76.70, N = 3; SE +/- 15.38, N = 3
(CC) gcc options: -pthread -O3 -lm

PolyBench-C

Test: 3 Matrix Multiplications

PolyBench-C 3.2, Test: 3 Matrix Multiplications (Seconds, Fewer Is Better)
GCC 8.0 x86-64: 60.68; GCC 8.0 znver1: 65.45; Clang 6.0 znver1: 62.75; Clang 6.0 x86-64: 62.98
Standard errors as listed in the chart: SE +/- 0.03, N = 3; SE +/- 0.21, N = 3; SE +/- 0.26, N = 3; SE +/- 0.24, N = 3
(CC) gcc options: -O3 -march=native

GraphicsMagick

Operation: Local Adaptive Thresholding

GraphicsMagick 1.3.19, Operation: Local Adaptive Thresholding (Iterations Per Minute, More Is Better)
Clang 6.0 znver1: 98; Clang 6.0 x86-64: 97; GCC 8.0 znver1: 95; GCC 8.0 x86-64: 92
Standard errors as listed in the chart: SE +/- 0.33, N = 3; SE +/- 0.33, N = 3
(CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -ldl -lpthread

ebizzy

ebizzy 0.3 (Records/s, More Is Better)
Clang 6.0 znver1: 1145405; Clang 6.0 x86-64: 1076648; GCC 8.0 x86-64: 1126032; GCC 8.0 znver1: 1101176
Standard errors as listed in the chart: SE +/- 17141.58, N = 6; SE +/- 12350.97, N = 3; SE +/- 20747.16, N = 6; SE +/- 19461.21, N = 3
(CC) gcc options: -pthread -lpthread -O3 -march=native

SQLite

Test Target: Default Test Directory

SQLite 3.8.10.2, Test Target: Default Test Directory (Seconds, Fewer Is Better)
GCC 8.0 znver1: 7.16; GCC 8.0 x86-64: 7.61; Clang 6.0 znver1: 7.48; Clang 6.0 x86-64: 7.53
Standard errors as listed in the chart: SE +/- 0.12, N = 6; SE +/- 0.12, N = 3; SE +/- 0.15, N = 6; SE +/- 0.06, N = 3
(CC) gcc options: -O3 -ldl -lpthread

Bullet Physics Engine

Test: 1000 Stack

Bullet Physics Engine 2.81, Test: 1000 Stack (Seconds, Fewer Is Better)
GCC 8.0 znver1: 5.93; GCC 8.0 x86-64: 6.18; Clang 6.0 znver1: 6.08; Clang 6.0 x86-64: 6.30
Standard errors as listed in the chart: SE +/- 0.01, N = 3; SE +/- 0.02, N = 3; SE +/- 0.01, N = 3; SE +/- 0.03, N = 3
(CXX) g++ options: -O3 -rdynamic

SciMark

Computational Test: Monte Carlo

SciMark 2.0, Computational Test: Monte Carlo (Mflops, More Is Better)
GCC 8.0 x86-64: 561.03; GCC 8.0 znver1: 555.76; Clang 6.0 znver1: 552.19; Clang 6.0 x86-64: 531.38
Standard errors as listed in the chart: SE +/- 0.04, N = 4; SE +/- 0.01, N = 4; SE +/- 0.05, N = 4; SE +/- 0.05, N = 4
(CC) gcc options: -O3 -lm

Bullet Physics Engine

Test: Raytests

Bullet Physics Engine 2.81, Test: Raytests (Seconds, Fewer Is Better)
GCC 8.0 znver1: 3.06; GCC 8.0 x86-64: 3.12; Clang 6.0 znver1: 3.18; Clang 6.0 x86-64: 3.22
Standard errors as listed in the chart: SE +/- 0.01, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3
(CXX) g++ options: -O3 -rdynamic

TSCP

AI Chess Performance

TSCP 1.81, AI Chess Performance (Nodes Per Second, More Is Better)
Clang 6.0 znver1: 918269; Clang 6.0 x86-64: 917658; GCC 8.0 znver1: 875085; GCC 8.0 x86-64: 874251
Standard errors as listed in the chart: SE +/- 306.40, N = 5; SE +/- 572.37, N = 5; SE +/- 556.88, N = 5; SE +/- 438.77, N = 5
(CC) gcc options: -O3 -march=native

Bullet Physics Engine

Test: 3000 Fall

Bullet Physics Engine 2.81, Test: 3000 Fall (Seconds, Fewer Is Better)
GCC 8.0 znver1: 5.27; GCC 8.0 x86-64: 5.34; Clang 6.0 znver1: 5.34; Clang 6.0 x86-64: 5.48
Standard errors as listed in the chart: SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.01, N = 3
(CXX) g++ options: -O3 -rdynamic

SciMark

Computational Test: Sparse Matrix Multiply

SciMark 2.0, Computational Test: Sparse Matrix Multiply (Mflops, More Is Better)
GCC 8.0 x86-64: 2263.87; GCC 8.0 znver1: 2259.95; Clang 6.0 znver1: 2258.64; Clang 6.0 x86-64: 2190.10
Standard errors as listed in the chart: SE +/- 6.72, N = 4; SE +/- 9.55, N = 4; SE +/- 29.07, N = 4; SE +/- 11.89, N = 4
(CC) gcc options: -O3 -lm

Apache Benchmark

Static Web Page Serving

Apache Benchmark 2.4.7, Static Web Page Serving (Requests Per Second, More Is Better)
GCC 8.0 x86-64: 9841.30; GCC 8.0 znver1: 9791.23; Clang 6.0 znver1: 9663.93; Clang 6.0 x86-64: 9531.43
Standard errors as listed in the chart: SE +/- 22.88, N = 3; SE +/- 35.54, N = 3; SE +/- 121.20, N = 3; SE +/- 161.49, N = 3
(CC) gcc options: -shared -fPIC -pthread -O3

Bullet Physics Engine

Test: Convex Trimesh

Bullet Physics Engine 2.81, Test: Convex Trimesh (Seconds, Fewer Is Better)
GCC 8.0 znver1: 1.30; GCC 8.0 x86-64: 1.34; Clang 6.0 znver1: 1.32; Clang 6.0 x86-64: 1.33
Standard errors as listed in the chart: SE +/- 0.00, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3
(CXX) g++ options: -O3 -rdynamic

Bullet Physics Engine

Test: 1000 Convex

Bullet Physics Engine 2.81, Test: 1000 Convex (Seconds, Fewer Is Better)
GCC 8.0 znver1: 5.28; GCC 8.0 x86-64: 5.44; Clang 6.0 znver1: 5.31; Clang 6.0 x86-64: 5.43
Standard errors as listed in the chart: SE +/- 0.01, N = 3; SE +/- 0.00, N = 3; SE +/- 0.01, N = 3; SE +/- 0.01, N = 3
(CXX) g++ options: -O3 -rdynamic

Bullet Physics Engine

Test: 136 Ragdolls

Bullet Physics Engine 2.81, Test: 136 Ragdolls (Seconds, Fewer Is Better)
GCC 8.0 znver1: 3.19; GCC 8.0 x86-64: 3.26; Clang 6.0 znver1: 3.23; Clang 6.0 x86-64: 3.28
Standard errors as listed in the chart: SE +/- 0.01, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3
(CXX) g++ options: -O3 -rdynamic

Bullet Physics Engine

Test: Prim Trimesh

Bullet Physics Engine 2.81, Test: Prim Trimesh (Seconds, Fewer Is Better)
Clang 6.0 znver1: 1.09; Clang 6.0 x86-64: 1.10; GCC 8.0 x86-64: 1.10; GCC 8.0 znver1: 1.10
Standard errors as listed in the chart: SE +/- 0.00, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3
(CXX) g++ options: -O3 -rdynamic

SciMark

Computational Test: Jacobi Successive Over-Relaxation

SciMark 2.0, Computational Test: Jacobi Successive Over-Relaxation (Mflops, More Is Better)
GCC 8.0 znver1: 1676.62; GCC 8.0 x86-64: 1423.14; Clang 6.0 znver1: 1424.21; Clang 6.0 x86-64: 1110.65
Standard errors as listed in the chart: SE +/- 0.77, N = 4; SE +/- 0.28, N = 4; SE +/- 0.49, N = 4; SE +/- 310.16, N = 4
(CC) gcc options: -O3 -lm

SciMark

Computational Test: Dense LU Matrix Factorization

SciMark 2.0, Computational Test: Dense LU Matrix Factorization (Mflops, More Is Better)
Clang 6.0 znver1: 4034.89; Clang 6.0 x86-64: 3190.43; GCC 8.0 znver1: 3678.86; GCC 8.0 x86-64: 3513.11
Standard errors as listed in the chart: SE +/- 135.63, N = 4; SE +/- 17.92, N = 4; SE +/- 110.38, N = 4; SE +/- 177.13, N = 4
(CC) gcc options: -O3 -lm

SciMark

Computational Test: Fast Fourier Transform

SciMark 2.0, Computational Test: Fast Fourier Transform (Mflops, More Is Better)
GCC 8.0 x86-64: 233.89; GCC 8.0 znver1: 231.09; Clang 6.0 znver1: 226.68; Clang 6.0 x86-64: 179.29
Standard errors as listed in the chart: SE +/- 0.35, N = 4; SE +/- 0.14, N = 4; SE +/- 0.38, N = 4; SE +/- 45.12, N = 4
(CC) gcc options: -O3 -lm

SciMark

Computational Test: Composite

SciMark 2.0, Computational Test: Composite (Mflops, More Is Better)
Clang 6.0 znver1: 1699.32; Clang 6.0 x86-64: 1479.53; GCC 8.0 znver1: 1680.45; GCC 8.0 x86-64: 1579.48
Standard errors as listed in the chart: SE +/- 25.95, N = 4; SE +/- 36.18, N = 8; SE +/- 20.36, N = 4; SE +/- 20.51, N = 7
(CC) gcc options: -O3 -lm

Timed HMMer Search

Pfam Database Search

Timed HMMer Search 2.3.2, Pfam Database Search (Seconds, Fewer Is Better)
Clang 6.0 znver1: 11.09; Clang 6.0 x86-64: 12.85; GCC 8.0 znver1: 12.40; GCC 8.0 x86-64: 13.65
Standard errors as listed in the chart: SE +/- 0.12, N = 3; SE +/- 1.28, N = 6; SE +/- 0.04, N = 3; SE +/- 1.74, N = 6
(CC) gcc options: -O3 -pthread -lhmmer -lsquid -lm


Phoronix Test Suite v10.8.4