GCC 8.0 vs. Clang 6.0 AMD EPYC Tuning Comparison

Tests for a future article on Phoronix.

HTML result view exported from: https://openbenchmarking.org/result/1801025-AL-CLANGCC0290&obr_imw=y&grr.

Configurations tested: GCC 8.0 (x86-64, znver1) and Clang 6.0 (x86-64, znver1)

Processor:                  AMD EPYC 7601 32-Core @ 2.20GHz (64 Cores)
Motherboard:                TYAN B8026T70AE24HR
Chipset:                    AMD Device 1450
Memory:                     126976MB
Disk:                       280GB INTEL SSDPE21D280GA
Graphics:                   ASPEED ASPEED Family
Monitor:                    VE228
Network:                    Broadcom Limited NetXtreme BCM5720 Gigabit PCIe
OS:                         Ubuntu 17.10
Kernel:                     4.13.0-21-generic (x86_64)
Desktop:                    GNOME Shell 3.26.1
Display Driver:             modesetting 1.19.5
OpenCL:                     OpenCL 1.2 pocl 1.0 LLVM 5.0.0
Compiler (GCC 8.0 runs):    GCC 8.0.0 20171231 + clang (GCC) 8.0.0 20171231 (experimental) + LLVM 5.0.0
Compiler (Clang 6.0 runs):  Clang 6.0.0 (SVN 321623) + LLVM 6.0.0svn
File-System:                ext4
Screen Resolution:          1920x1080
System Layer:               vm-other Xen 4.9.0 Hypervisor

Compiler Details
- GCC 8.0: x86-64: --disable-multilib --enable-checking=release
- GCC 8.0: znver1: --disable-multilib --enable-checking=release
- Clang 6.0: x86-64: Optimized build; Default target: x86_64-unknown-linux-gnu; Host CPU: znver1
- Clang 6.0: znver1: Optimized build; Default target: x86_64-unknown-linux-gnu; Host CPU: znver1

Disk Details
- NONE / data=ordered,errors=remount-ro,relatime,rw

Result Overview

Test                                           GCC 8.0      GCC 8.0      Clang 6.0    Clang 6.0
                                               x86-64       znver1       x86-64       znver1
apache: Static Web Page Serving                9841.30      9791.23      9531.43      9663.93
encode-mp3: WAV To MP3                         11.10        10.81        11.33        12.81
encode-flac: WAV To FLAC                       7.12         7.45         7.94         6.63
bullet: Convex Trimesh                         1.34         1.30         1.33         1.32
bullet: Prim Trimesh                           1.10         1.10         1.10         1.09
bullet: 136 Ragdolls                           3.26         3.19         3.28         3.23
bullet: 1000 Convex                            5.44         5.28         5.43         5.31
bullet: 1000 Stack                             6.18         5.93         6.30         6.08
bullet: 3000 Fall                              5.34         5.27         5.48         5.34
bullet: Raytests                               3.12         3.06         3.22         3.18
c-ray: Total Time                              3.93         3.37         4.53         4.48
ebizzy                                         1126032      1101176      1076648      1145405
himeno: Poisson Pressure Solver                949.19       935.64       1032.71      1052.47
graphics-magick: Local Adaptive Thresholding   92           95           97           98
graphics-magick: HWB Color Space               177          186          150          155
graphics-magick: Sharpen                       157          165          131          136
graphics-magick: Blur                          116          123          101          104
tscp: AI Chess Performance                     874251       875085       917658       918269
scimark2: Jacobi Successive Over-Relaxation    1423.14      1676.62      1110.65      1424.21
scimark2: Dense LU Matrix Factorization        3513.11      3678.86      3190.43      4034.89
scimark2: Sparse Matrix Multiply               2263.87      2259.95      2190.10      2258.64
scimark2: Fast Fourier Transform               233.89       231.09       179.29       226.68
scimark2: Monte Carlo                          561.03       555.76       531.38       552.19
scimark2: Composite                            1579.48      1680.45      1479.53      1699.32
hmmer: Pfam Database Search                    13.65        12.40        12.85        11.09
fftw: Float + SSE - 2D FFT Size 4096           (no result)  13630        13649        12481
fftw: Stock - 2D FFT Size 4096                 4959.73      5627.83      4660.83      5031.60
polybench-c: 3 Matrix Multiplications          60.68        65.45        62.98        62.75
sqlite: Default Test Directory                 7.61         7.16         7.53         7.48
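To read the overview as a tuning comparison, it can help to express each znver1 result as a percentage change against the x86-64 result from the same compiler. The following is a minimal sketch, not part of the exported results: the function name `tuning_gain_percent` and the `RESULTS` dictionary are made up for illustration, and only four rows are copied from the overview. The sign convention (positive means the znver1 build did better) is also an assumption of this sketch.

```python
# Illustrative sketch: values copied by hand from the result overview.
# Each entry maps (compiler, test) to (x86-64 result, znver1 result, higher_is_better).
RESULTS = {
    ("GCC 8.0", "c-ray: Total Time"): (3.93, 3.37, False),
    ("Clang 6.0", "c-ray: Total Time"): (4.53, 4.48, False),
    ("GCC 8.0", "fftw: Stock - 2D FFT Size 4096"): (4959.73, 5627.83, True),
    ("Clang 6.0", "fftw: Stock - 2D FFT Size 4096"): (4660.83, 5031.60, True),
}


def tuning_gain_percent(generic, tuned, higher_is_better):
    """Percent improvement of the tuned (znver1) build over the generic (x86-64) build.

    Positive means the tuned build did better. For "fewer is better" metrics
    the gain is the relative reduction; for "more is better" metrics it is
    the relative increase.
    """
    if higher_is_better:
        return (tuned - generic) / generic * 100.0
    return (generic - tuned) / generic * 100.0


if __name__ == "__main__":
    for (compiler, test), (generic, tuned, hib) in RESULTS.items():
        gain = tuning_gain_percent(generic, tuned, hib)
        print(f"{compiler:10s} {test:32s} {gain:+7.2f}%")
```

The same function can be applied to any other row, as long as the "More Is Better" or "Fewer Is Better" direction is taken from that test's chart heading.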

Apache Benchmark

Static Web Page Serving

Apache Benchmark 2.4.7 (Requests Per Second, More Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 x86-64      9841.30     22.88      3
Clang 6.0 x86-64    9531.43     161.49     3
GCC 8.0 znver1      9791.23     35.54      3
Clang 6.0 znver1    9663.93     121.20     3

1. (CC) gcc options: -shared -fPIC -pthread -O3 -march=znver1

LAME MP3 Encoding

WAV To MP3

LAME MP3 Encoding 3.99.5 (Seconds, Fewer Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 x86-64      11.10       0.02       5
Clang 6.0 x86-64    11.33       0.00       5
GCC 8.0 znver1      10.81       0.01       5
Clang 6.0 znver1    12.81       0.01       5

1. (CC) gcc options: -O3 -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -pipe -march=znver1 -lm

FLAC Audio Encoding

WAV To FLAC

FLAC Audio Encoding 1.3.1 (Seconds, Fewer Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 x86-64      7.12        0.01       5
Clang 6.0 x86-64    7.94        0.01       5
GCC 8.0 znver1      7.45        0.01       5
Clang 6.0 znver1    6.63        0.00       5

1. (CXX) g++ options: -O3 -march=znver1 -logg -lm

Bullet Physics Engine

Test: Convex Trimesh

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 x86-64      1.34        0.00       3
Clang 6.0 x86-64    1.33        0.00       3
GCC 8.0 znver1      1.30        0.00       3
Clang 6.0 znver1    1.32        0.00       3

1. (CXX) g++ options: -O3 -march=znver1 -rdynamic

Bullet Physics Engine

Test: Prim Trimesh

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 x86-64      1.10        0.00       3
Clang 6.0 x86-64    1.10        0.00       3
GCC 8.0 znver1      1.10        0.00       3
Clang 6.0 znver1    1.09        0.00       3

1. (CXX) g++ options: -O3 -march=znver1 -rdynamic

Bullet Physics Engine

Test: 136 Ragdolls

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 x86-64      3.26        0.00       3
Clang 6.0 x86-64    3.28        0.00       3
GCC 8.0 znver1      3.19        0.01       3
Clang 6.0 znver1    3.23        0.00       3

1. (CXX) g++ options: -O3 -march=znver1 -rdynamic

Bullet Physics Engine

Test: 1000 Convex

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 x86-64      5.44        0.00       3
Clang 6.0 x86-64    5.43        0.01       3
GCC 8.0 znver1      5.28        0.01       3
Clang 6.0 znver1    5.31        0.01       3

1. (CXX) g++ options: -O3 -march=znver1 -rdynamic

Bullet Physics Engine

Test: 1000 Stack

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 x86-64      6.18        0.02       3
Clang 6.0 x86-64    6.30        0.03       3
GCC 8.0 znver1      5.93        0.01       3
Clang 6.0 znver1    6.08        0.01       3

1. (CXX) g++ options: -O3 -march=znver1 -rdynamic

Bullet Physics Engine

Test: 3000 Fall

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 x86-64      5.34        0.01       3
Clang 6.0 x86-64    5.48        0.01       3
GCC 8.0 znver1      5.27        0.01       3
Clang 6.0 znver1    5.34        0.01       3

1. (CXX) g++ options: -O3 -march=znver1 -rdynamic

Bullet Physics Engine

Test: Raytests

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 x86-64      3.12        0.00       3
Clang 6.0 x86-64    3.22        0.00       3
GCC 8.0 znver1      3.06        0.01       3
Clang 6.0 znver1    3.18        0.00       3

1. (CXX) g++ options: -O3 -march=znver1 -rdynamic

C-Ray

Total Time

C-Ray 1.1 (Seconds, Fewer Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 x86-64      3.93        0.00       3
Clang 6.0 x86-64    4.53        0.02       3
GCC 8.0 znver1      3.37        0.02       3
Clang 6.0 znver1    4.48        0.01       3

1. (CC) gcc options: -lm -lpthread -O3 -march=znver1

ebizzy

ebizzy 0.3 (Records/s, More Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 x86-64      1126032     20747.16   6
Clang 6.0 x86-64    1076648     12350.97   3
GCC 8.0 znver1      1101176     19461.21   3
Clang 6.0 znver1    1145405     17141.58   6

1. (CC) gcc options: -pthread -lpthread -O3 -march=znver1 -march=native

Himeno Benchmark

Poisson Pressure Solver

Himeno Benchmark 3.0 (MFLOPS, More Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 x86-64      949.19      0.43       3
Clang 6.0 x86-64    1032.71     1.46       3
GCC 8.0 znver1      935.64      0.79       3
Clang 6.0 znver1    1052.47     1.50       3

1. (CC) gcc options: -O3 -march=znver1 -mavx2

GraphicsMagick

Operation: Local Adaptive Thresholding

GraphicsMagick 1.3.19 (Iterations Per Minute, More Is Better)

Configuration       Result
GCC 8.0 x86-64      92
Clang 6.0 x86-64    97
GCC 8.0 znver1      95
Clang 6.0 znver1    98

Standard errors shown in the chart: SE +/- 0.33, N = 3; SE +/- 0.33, N = 3 (only two of the four results carry one).

1. (CC) gcc options: -fopenmp -O3 -march=znver1 -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

GraphicsMagick

Operation: HWB Color Space

GraphicsMagick 1.3.19 (Iterations Per Minute, More Is Better)

Configuration       Result
GCC 8.0 x86-64      177
Clang 6.0 x86-64    150
GCC 8.0 znver1      186
Clang 6.0 znver1    155

Standard errors shown in the chart: SE +/- 1.20, N = 3; SE +/- 0.88, N = 3 (only two of the four results carry one).

1. (CC) gcc options: -fopenmp -O3 -march=znver1 -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.19 (Iterations Per Minute, More Is Better)

Configuration       Result
GCC 8.0 x86-64      157
Clang 6.0 x86-64    131
GCC 8.0 znver1      165
Clang 6.0 znver1    136

Standard errors shown in the chart: SE +/- 0.33, N = 3; SE +/- 0.67, N = 3 (only two of the four results carry one).

1. (CC) gcc options: -fopenmp -O3 -march=znver1 -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

GraphicsMagick

Operation: Blur

GraphicsMagick 1.3.19 (Iterations Per Minute, More Is Better)

Configuration       Result
GCC 8.0 x86-64      116
Clang 6.0 x86-64    101
GCC 8.0 znver1      123
Clang 6.0 znver1    104

Standard errors shown in the chart: SE +/- 0.33, N = 3; SE +/- 0.33, N = 3 (only two of the four results carry one).

1. (CC) gcc options: -fopenmp -O3 -march=znver1 -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

TSCP

AI Chess Performance

TSCP 1.81 (Nodes Per Second, More Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 x86-64      874251      438.77     5
Clang 6.0 x86-64    917658      572.37     5
GCC 8.0 znver1      875085      556.88     5
Clang 6.0 znver1    918269      306.40     5

1. (CC) gcc options: -O3 -march=znver1 -march=native

SciMark

Computational Test: Jacobi Successive Over-Relaxation

SciMark 2.0 (Mflops, More Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 x86-64      1423.14     0.28       4
Clang 6.0 x86-64    1110.65     310.16     4
GCC 8.0 znver1      1676.62     0.77       4
Clang 6.0 znver1    1424.21     0.49       4

1. (CC) gcc options: -O3 -march=znver1 -lm
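The standard errors in this chart differ by three orders of magnitude, so one quick sanity check is to relate each SE to its mean before comparing compilers. The sketch below assumes that "SE" is the standard error of the mean over N runs (so the sample standard deviation is SE times the square root of N), and that the values map to configurations in the chart's x86-64 then znver1 order. The helper names and the 5% noise threshold are arbitrary choices for illustration, not something the result file defines.

```python
import math

# Jacobi Successive Over-Relaxation results: (mean Mflops, SE, N).
# Assumption: values are listed in the chart's order.
JACOBI = {
    "GCC 8.0 x86-64": (1423.14, 0.28, 4),
    "Clang 6.0 x86-64": (1110.65, 310.16, 4),
    "GCC 8.0 znver1": (1676.62, 0.77, 4),
    "Clang 6.0 znver1": (1424.21, 0.49, 4),
}

NOISE_THRESHOLD = 0.05  # hypothetical cutoff: flag results whose SE exceeds 5% of the mean


def relative_se(mean, se):
    """Standard error expressed as a fraction of the mean."""
    return se / mean


def stddev_from_se(se, n):
    """Recover the sample standard deviation, assuming SE = stddev / sqrt(N)."""
    return se * math.sqrt(n)


if __name__ == "__main__":
    for name, (mean, se, n) in JACOBI.items():
        rel = relative_se(mean, se)
        label = "noisy" if rel > NOISE_THRESHOLD else "stable"
        print(f"{name:18s} rel. SE {rel:7.2%}  stddev {stddev_from_se(se, n):8.2f}  {label}")
```

Under these assumptions only the Clang 6.0 x86-64 result is flagged, which suggests its lower mean may reflect run-to-run variance rather than a code generation difference.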

SciMark

Computational Test: Dense LU Matrix Factorization

SciMark 2.0 (Mflops, More Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 x86-64      3513.11     177.13     4
Clang 6.0 x86-64    3190.43     17.92      4
GCC 8.0 znver1      3678.86     110.38     4
Clang 6.0 znver1    4034.89     135.63     4

1. (CC) gcc options: -O3 -march=znver1 -lm

SciMark

Computational Test: Sparse Matrix Multiply

SciMark 2.0 (Mflops, More Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 x86-64      2263.87     6.72       4
Clang 6.0 x86-64    2190.10     11.89      4
GCC 8.0 znver1      2259.95     9.55       4
Clang 6.0 znver1    2258.64     29.07      4

1. (CC) gcc options: -O3 -march=znver1 -lm

SciMark

Computational Test: Fast Fourier Transform

SciMark 2.0 (Mflops, More Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 x86-64      233.89      0.35       4
Clang 6.0 x86-64    179.29      45.12      4
GCC 8.0 znver1      231.09      0.14       4
Clang 6.0 znver1    226.68      0.38       4

1. (CC) gcc options: -O3 -march=znver1 -lm

SciMark

Computational Test: Monte Carlo

SciMark 2.0 (Mflops, More Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 x86-64      561.03      0.04       4
Clang 6.0 x86-64    531.38      0.05       4
GCC 8.0 znver1      555.76      0.01       4
Clang 6.0 znver1    552.19      0.05       4

1. (CC) gcc options: -O3 -march=znver1 -lm

SciMark

Computational Test: Composite

SciMark 2.0 (Mflops, More Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 x86-64      1579.48     20.51      7
Clang 6.0 x86-64    1479.53     36.18      8
GCC 8.0 znver1      1680.45     20.36      4
Clang 6.0 znver1    1699.32     25.95      4

1. (CC) gcc options: -O3 -march=znver1 -lm

Timed HMMer Search

Pfam Database Search

Timed HMMer Search 2.3.2 (Seconds, Fewer Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 x86-64      13.65       1.74       6
Clang 6.0 x86-64    12.85       1.28       6
GCC 8.0 znver1      12.40       0.04       3
Clang 6.0 znver1    11.09       0.12       3

1. (CC) gcc options: -O3 -march=znver1 -pthread -lhmmer -lsquid -lm

FFTW

Build: Float + SSE - Size: 2D FFT Size 4096

FFTW 3.3.6 (Mflops, More Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 znver1      13630       15.38      3
Clang 6.0 znver1    12481       76.70      3
Clang 6.0 x86-64    13649       100.95     3

No result is listed for GCC 8.0 x86-64 in this test.

1. (CC) gcc options: -pthread -O3 -march=znver1 -lm

FFTW

Build: Stock - Size: 2D FFT Size 4096

FFTW 3.3.6 (Mflops, More Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 x86-64      4959.73     1.16       3
Clang 6.0 x86-64    4660.83     57.25      3
GCC 8.0 znver1      5627.83     10.37      3
Clang 6.0 znver1    5031.60     17.93      3

1. (CC) gcc options: -pthread -O3 -march=znver1 -lm

PolyBench-C

Test: 3 Matrix Multiplications

PolyBench-C 3.2 (Seconds, Fewer Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 x86-64      60.68       0.03       3
Clang 6.0 x86-64    62.98       0.24       3
GCC 8.0 znver1      65.45       0.21       3
Clang 6.0 znver1    62.75       0.26       3

1. (CC) gcc options: -O3 -march=znver1 -march=native

SQLite

Test Target: Default Test Directory

SQLite 3.8.10.2 (Seconds, Fewer Is Better)

Configuration       Result      SE +/-     N
GCC 8.0 x86-64      7.61        0.12       3
Clang 6.0 x86-64    7.53        0.06       3
GCC 8.0 znver1      7.16        0.12       6
Clang 6.0 znver1    7.48        0.15       6

1. (CC) gcc options: -O3 -march=znver1 -ldl -lpthread


Phoronix Test Suite v10.8.4