GCC 7.2 vs. GCC 8 vs. LLVM Clang 5 vs. LLVM Clang 6 Znver1 EPYC

Compiler benchmarks on an AMD EPYC 7601 32-Core system with a TYAN B8026T70AE24HR motherboard, comparing GCC 7.2, GCC 8, LLVM Clang 5 and LLVM Clang 6. Tests for a future article on Phoronix.

HTML result view exported from: https://openbenchmarking.org/result/1711053-AL-GCC72VSGC51&rdt&grr.

System details (identical for all four configurations except the compiler)

Processor: AMD EPYC 7601 32-Core @ 2.20GHz (32 Cores / 64 Threads)
Motherboard: TYAN B8026T70AE24HR
Chipset: AMD Device 1450
Memory: 129024MB
Disk: 120GB Force MP500
Graphics: ASPEED ASPEED Family
Monitor: Acer P243W
Network: Broadcom Limited NetXtreme BCM5720 Gigabit PCIe
OS: Ubuntu 17.10
Kernel: 4.13.0-16-generic (x86_64)
Desktop: GNOME Shell 3.26.1
File-System: ext4
Screen Resolution: 1920x1200

Compiler, per configuration
GCC 7.2.0: GCC 7.2.0
GCC 8.0.0 20171030: GCC 8.0.0 20171030
LLVM Clang 5.0: Clang 5.0.0-3
LLVM Clang 6.0 svn316959: Clang 6.0.0-svn316959-1~exp1

Compiler Details
- GCC 7.2.0, GCC 8.0.0 20171030: --disable-multilib --enable-checking=release --enable-languages=c,c++
Processor Details
- Scaling Governor: acpi-cpufreq ondemand

Results overview

Columns: GCC 7.2.0, GCC 8.0.0 20171030, LLVM Clang 5.0, LLVM Clang 6.0 svn316959. A dash marks a test with no result for that compiler. Units differ per test and are given in the sections that follow.

   GCC 7.2.0   GCC 8.0.0   Clang 5.0   Clang 6.0  Test
   192063.14   193210.49   199028.19  1260800.37  redis: SET
   198580.74   198883.33   197952.09  1584704.00  redis: GET
   193867.82   192219.37   203774.66  1247939.29  redis: LPUSH
   189478.65   193163.62   194776.06  1489239.30  redis: SADD
   202065.84   195542.12   191613.95  1823824.96  redis: LPOP
    10784.92    10709.13    10669.81           -  pgbench: Buffer Test - Single Thread - Read Only
   308134.57   303543.17   167561.19           -  pgbench: Buffer Test - Normal Load - Read Only
      141.55      143.79      142.26      144.46  tjbench: Decompression Throughput
       10.69       10.66       10.41       10.45  ffmpeg: H.264 HD To NTSC DV
       11.20       10.71       12.81       12.89  encode-mp3: WAV To MP3
        4485        4500        4356        4429  stockfish: Total Time
           4           4           4           4  smallpt: Global Illumination Renderer; 100 Samples
       11.91       11.94       11.82       11.80  primesieve: 1e12 Prime Number Generation
        3.08        2.74        4.26        4.55  c-ray: Total Time
      936.77      942.44      965.54      978.76  himeno: Poisson Pressure Solver
         109         110         110         110  graphics-magick: Local Adaptive Thresholding
         198         199         161         162  graphics-magick: HWB Color Space
         174         176           -           -  graphics-magick: Resizing
         177         179         145         146  graphics-magick: Sharpen
         148         148         115         116  graphics-magick: Blur
      407.27      407.58       41.26       41.02  ttsiod-renderer: Phong Rendering With Soft-Shadow Mapping
      861129      853655      895284      933206  tscp: AI Chess Performance
     1683.40     1685.48     1427.97     1427.86  scimark2: Jacobi Successive Over-Relaxation
     5150.44     4757.53     4938.62     4883.39  scimark2: Dense LU Matrix Factorization
     2376.61     2444.39     2402.42     2534.37  scimark2: Sparse Matrix Multiply
      223.69      223.06      221.42      221.33  scimark2: Fast Fourier Transform
      194.45      554.99      560.13      551.21  scimark2: Monte Carlo
     1925.72     1933.09     1910.11     1923.63  scimark2: Composite
     3918.80     3926.20           -           -  gmpbench: Total Time
       19834       21131       18762       19036  fftw: Float + SSE - 2D FFT Size 1024
       26763       26876       26224       26395  fftw: Float + SSE - 1D FFT Size 1024
     6420.23     6622.90     6090.37     5852.20  fftw: Stock - 2D FFT Size 1024
     8539.10     8416.23     7613.47     7580.83  fftw: Stock - 1D FFT Size 1024
        0.73        0.72        0.74        0.72  hpcg

Redis

Test: SET

Redis 3.0.1 - Requests Per Second, More Is Better
GCC 7.2.0: 192063.14 (SE +/- 1582.98, N = 3)
GCC 8.0.0 20171030: 193210.49 (SE +/- 3816.59, N = 3)
LLVM Clang 5.0: 199028.19 (SE +/- 3063.36, N = 6)
LLVM Clang 6.0 svn316959: 1260800.37 (SE +/- 13710.54, N = 3)
1. (CC) gcc options: -ggdb -rdynamic -lm -pthread

Redis

Test: GET

Redis 3.0.1 - Requests Per Second, More Is Better
GCC 7.2.0: 198580.74 (SE +/- 3940.07, N = 6)
GCC 8.0.0 20171030: 198883.33 (SE +/- 2615.42, N = 6)
LLVM Clang 5.0: 197952.09 (SE +/- 3998.05, N = 6)
LLVM Clang 6.0 svn316959: 1584704.00 (SE +/- 24641.50, N = 3)
1. (CC) gcc options: -ggdb -rdynamic -lm -pthread

Redis

Test: LPUSH

Redis 3.0.1 - Requests Per Second, More Is Better
GCC 7.2.0: 193867.82 (SE +/- 2952.51, N = 6)
GCC 8.0.0 20171030: 192219.37 (SE +/- 3068.09, N = 4)
LLVM Clang 5.0: 203774.66 (SE +/- 2321.76, N = 3)
LLVM Clang 6.0 svn316959: 1247939.29 (SE +/- 20238.90, N = 6)
1. (CC) gcc options: -ggdb -rdynamic -lm -pthread

Redis

Test: SADD

Redis 3.0.1 - Requests Per Second, More Is Better
GCC 7.2.0: 189478.65 (SE +/- 303.66, N = 3)
GCC 8.0.0 20171030: 193163.62 (SE +/- 2485.81, N = 3)
LLVM Clang 5.0: 194776.06 (SE +/- 2862.24, N = 3)
LLVM Clang 6.0 svn316959: 1489239.30 (SE +/- 43532.25, N = 6)
1. (CC) gcc options: -ggdb -rdynamic -lm -pthread

Redis

Test: LPOP

Redis 3.0.1 - Requests Per Second, More Is Better
GCC 7.2.0: 202065.84 (SE +/- 3741.75, N = 3)
GCC 8.0.0 20171030: 195542.12 (SE +/- 3274.37, N = 6)
LLVM Clang 5.0: 191613.95 (SE +/- 1726.15, N = 3)
LLVM Clang 6.0 svn316959: 1823824.96 (SE +/- 97746.43, N = 6)
1. (CC) gcc options: -ggdb -rdynamic -lm -pthread

PostgreSQL pgbench

Scaling: Buffer Test - Test: Single Thread - Mode: Read Only

PostgreSQL pgbench 10.0 - TPS, More Is Better
GCC 7.2.0: 10784.92 (SE +/- 140.64, N = 3) [additional options: -shared]
GCC 8.0.0 20171030: 10709.13 (SE +/- 47.06, N = 3) [additional options: -shared]
LLVM Clang 5.0: 10669.81 (SE +/- 58.33, N = 3) [additional options: -lpgcommon -lpgport -lpthread -lrt -lcrypt -ldl -lm]
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver1 -fPIC

PostgreSQL pgbench

Scaling: Buffer Test - Test: Normal Load - Mode: Read Only

PostgreSQL pgbench 10.0 - TPS, More Is Better
GCC 7.2.0: 308134.57 (SE +/- 1439.36, N = 3) [additional options: -shared]
GCC 8.0.0 20171030: 303543.17 (SE +/- 3027.80, N = 3) [additional options: -shared]
LLVM Clang 5.0: 167561.19 (SE +/- 15544.05, N = 6) [additional options: -lpgcommon -lpgport -lpthread -lrt -lcrypt -ldl -lm]
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=znver1 -fPIC

libjpeg-turbo tjbench

Test: Decompression Throughput

libjpeg-turbo tjbench 1.5.1 - Megapixels/sec, More Is Better
GCC 7.2.0: 141.55 (SE +/- 0.03, N = 3)
GCC 8.0.0 20171030: 143.79 (SE +/- 0.02, N = 3)
LLVM Clang 5.0: 142.26 (SE +/- 0.09, N = 3)
LLVM Clang 6.0 svn316959: 144.46 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -O3 -march=znver1 -lm

FFmpeg

H.264 HD To NTSC DV

FFmpeg 3.3.3 - Seconds, Fewer Is Better
GCC 7.2.0: 10.69 (SE +/- 0.04, N = 3) [additional options: -fno-tree-vectorize]
GCC 8.0.0 20171030: 10.66 (SE +/- 0.07, N = 3) [additional options: -fno-tree-vectorize]
LLVM Clang 5.0: 10.41 (SE +/- 0.11, N = 3) [additional options: -Qunused-arguments -mstack-alignment=16]
LLVM Clang 6.0 svn316959: 10.45 (SE +/- 0.04, N = 3) [additional options: -Qunused-arguments -mstack-alignment=16]
1. (CC) gcc options: -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -ldl -lxcb -lxcb-xfixes -lxcb-shape -lasound -lm -llzma -pthread -O3 -march=znver1 -std=c11 -fomit-frame-pointer -fno-math-errno -fno-signed-zeros -MMD -MF -MT

LAME MP3 Encoding

WAV To MP3

LAME MP3 Encoding 3.99.5 - Seconds, Fewer Is Better
GCC 7.2.0: 11.20 (SE +/- 0.01, N = 5)
GCC 8.0.0 20171030: 10.71 (SE +/- 0.00, N = 5)
LLVM Clang 5.0: 12.81 (SE +/- 0.00, N = 5)
LLVM Clang 6.0 svn316959: 12.89 (SE +/- 0.01, N = 5)
1. (CC) gcc options: -O3 -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -pipe -march=znver1 -lm

Stockfish

Total Time

Stockfish 2014-11-26 - ms, Fewer Is Better
GCC 7.2.0: 4485 (SE +/- 56.11, N = 3)
GCC 8.0.0 20171030: 4500 (SE +/- 55.17, N = 3)
LLVM Clang 5.0: 4356 (SE +/- 4.04, N = 3)
LLVM Clang 6.0 svn316959: 4429 (SE +/- 44.52, N = 3)
Additional per-run options as exported (not attributable to a specific compiler): -flto; -flto
1. (CXX) g++ options: -lpthread -O3 -march=znver1 -fno-exceptions -fno-rtti -ansi -pedantic -msse -msse3 -mpopcnt

Smallpt

Global Illumination Renderer; 100 Samples

Smallpt 1.0 - Seconds, Fewer Is Better
GCC 7.2.0: 4
GCC 8.0.0 20171030: 4
LLVM Clang 5.0: 4
LLVM Clang 6.0 svn316959: 4
SE values as exported (not attributable to a specific compiler): SE +/- 0.17, N = 6; SE +/- 0.17, N = 6
1. (CXX) g++ options: -fopenmp -O3 -march=znver1

Primesieve

1e12 Prime Number Generation

Primesieve 6.2 - Seconds, Fewer Is Better
GCC 7.2.0: 11.91 (SE +/- 0.03, N = 3)
GCC 8.0.0 20171030: 11.94 (SE +/- 0.04, N = 3)
LLVM Clang 5.0: 11.82 (SE +/- 0.02, N = 3)
LLVM Clang 6.0 svn316959: 11.80 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -O3 -march=znver1 -rdynamic -lpthread

C-Ray

Total Time

C-Ray 1.1 - Seconds, Fewer Is Better
GCC 7.2.0: 3.08 (SE +/- 0.06, N = 3)
GCC 8.0.0 20171030: 2.74 (SE +/- 0.07, N = 6)
LLVM Clang 5.0: 4.26 (SE +/- 0.04, N = 3)
LLVM Clang 6.0 svn316959: 4.55 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -lm -lpthread -O3 -march=znver1

Himeno Benchmark

Poisson Pressure Solver

Himeno Benchmark 3.0 - MFLOPS, More Is Better
GCC 7.2.0: 936.77 (SE +/- 21.77, N = 6)
GCC 8.0.0 20171030: 942.44 (SE +/- 1.81, N = 3)
LLVM Clang 5.0: 965.54 (SE +/- 2.40, N = 3)
LLVM Clang 6.0 svn316959: 978.76 (SE +/- 3.60, N = 3)
1. (CC) gcc options: -O3 -march=znver1 -mavx2

GraphicsMagick

Operation: Local Adaptive Thresholding

GraphicsMagick 1.3.19 - Iterations Per Minute, More Is Better
GCC 7.2.0: 109 [additional options: -ldl]
GCC 8.0.0 20171030: 110 [additional options: -ldl]
LLVM Clang 5.0: 110 [additional options: -lgomp]
LLVM Clang 6.0 svn316959: 110 [additional options: -lgomp]
SE value as exported (not attributable to a specific compiler): SE +/- 1.20, N = 3
1. (CC) gcc options: -fopenmp -O3 -march=znver1 -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lX11 -llzma -lz -lm -lpthread

GraphicsMagick

Operation: HWB Color Space

GraphicsMagick 1.3.19 - Iterations Per Minute, More Is Better
GCC 7.2.0: 198 [additional options: -ldl]
GCC 8.0.0 20171030: 199 [additional options: -ldl]
LLVM Clang 5.0: 161 [additional options: -lgomp]
LLVM Clang 6.0 svn316959: 162 [additional options: -lgomp]
1. (CC) gcc options: -fopenmp -O3 -march=znver1 -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lX11 -llzma -lz -lm -lpthread

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.19 - Iterations Per Minute, More Is Better
GCC 7.2.0: 174 (SE +/- 0.67, N = 3)
GCC 8.0.0 20171030: 176 (SE +/- 0.88, N = 3)
1. (CC) gcc options: -fopenmp -O3 -march=znver1 -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lX11 -llzma -lz -lm -ldl -lpthread

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.19 - Iterations Per Minute, More Is Better
GCC 7.2.0: 177 [additional options: -ldl]
GCC 8.0.0 20171030: 179 [additional options: -ldl]
LLVM Clang 5.0: 145 [additional options: -lgomp]
LLVM Clang 6.0 svn316959: 146 [additional options: -lgomp]
SE values as exported (not attributable to a specific compiler): SE +/- 0.33, N = 3; SE +/- 0.67, N = 3
1. (CC) gcc options: -fopenmp -O3 -march=znver1 -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lX11 -llzma -lz -lm -lpthread

GraphicsMagick

Operation: Blur

GraphicsMagick 1.3.19 - Iterations Per Minute, More Is Better
GCC 7.2.0: 148 [additional options: -ldl]
GCC 8.0.0 20171030: 148 [additional options: -ldl]
LLVM Clang 5.0: 115 [additional options: -lgomp]
LLVM Clang 6.0 svn316959: 116 [additional options: -lgomp]
SE values as exported (not attributable to a specific compiler): SE +/- 0.67, N = 3; SE +/- 1.00, N = 3
1. (CC) gcc options: -fopenmp -O3 -march=znver1 -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lX11 -llzma -lz -lm -lpthread

TTSIOD 3D Renderer

Phong Rendering With Soft-Shadow Mapping

TTSIOD 3D Renderer 2.3a - FPS, More Is Better
GCC 7.2.0: 407.27 (SE +/- 6.30, N = 6)
GCC 8.0.0 20171030: 407.58 (SE +/- 6.79, N = 4)
LLVM Clang 5.0: 41.26 (SE +/- 0.05, N = 3)
LLVM Clang 6.0 svn316959: 41.02 (SE +/- 0.06, N = 3)
1. (CXX) g++ options: -O3 -march=znver1 -fomit-frame-pointer -ffast-math -mtune=native -flto -msse -mrecip -mfpmath=sse -msse2 -mssse3 -lSDL -lstdc++

TSCP

AI Chess Performance

TSCP 1.81 - Nodes Per Second, More Is Better
GCC 7.2.0: 861129 (SE +/- 329.95, N = 5)
GCC 8.0.0 20171030: 853655 (SE +/- 264.40, N = 5)
LLVM Clang 5.0: 895284 (SE +/- 544.63, N = 5)
LLVM Clang 6.0 svn316959: 933206 (SE +/- 316.00, N = 5)
1. (CC) gcc options: -O3 -march=znver1 -march=native

SciMark

Computational Test: Jacobi Successive Over-Relaxation

SciMark 2.0 - Mflops, More Is Better
GCC 7.2.0: 1683.40 (SE +/- 0.70, N = 4)
GCC 8.0.0 20171030: 1685.48 (SE +/- 0.66, N = 4)
LLVM Clang 5.0: 1427.97 (SE +/- 0.12, N = 4)
LLVM Clang 6.0 svn316959: 1427.86 (SE +/- 0.24, N = 4)
1. (CC) gcc options: -O3 -march=znver1 -lm

SciMark

Computational Test: Dense LU Matrix Factorization

SciMark 2.0 - Mflops, More Is Better
GCC 7.2.0: 5150.44 (SE +/- 37.84, N = 4)
GCC 8.0.0 20171030: 4757.53 (SE +/- 31.33, N = 4)
LLVM Clang 5.0: 4938.62 (SE +/- 38.03, N = 4)
LLVM Clang 6.0 svn316959: 4883.39 (SE +/- 14.77, N = 4)
1. (CC) gcc options: -O3 -march=znver1 -lm

SciMark

Computational Test: Sparse Matrix Multiply

SciMark 2.0 - Mflops, More Is Better
GCC 7.2.0: 2376.61 (SE +/- 13.62, N = 4)
GCC 8.0.0 20171030: 2444.39 (SE +/- 9.19, N = 4)
LLVM Clang 5.0: 2402.42 (SE +/- 11.52, N = 4)
LLVM Clang 6.0 svn316959: 2534.37 (SE +/- 13.69, N = 4)
1. (CC) gcc options: -O3 -march=znver1 -lm

SciMark

Computational Test: Fast Fourier Transform

SciMark 2.0 - Mflops, More Is Better
GCC 7.2.0: 223.69 (SE +/- 0.12, N = 4)
GCC 8.0.0 20171030: 223.06 (SE +/- 0.07, N = 4)
LLVM Clang 5.0: 221.42 (SE +/- 0.15, N = 4)
LLVM Clang 6.0 svn316959: 221.33 (SE +/- 0.02, N = 4)
1. (CC) gcc options: -O3 -march=znver1 -lm

SciMark

Computational Test: Monte Carlo

SciMark 2.0 - Mflops, More Is Better
GCC 7.2.0: 194.45 (SE +/- 0.01, N = 4)
GCC 8.0.0 20171030: 554.99 (SE +/- 0.07, N = 4)
LLVM Clang 5.0: 560.13 (SE +/- 0.06, N = 4)
LLVM Clang 6.0 svn316959: 551.21 (SE +/- 0.19, N = 4)
1. (CC) gcc options: -O3 -march=znver1 -lm

SciMark

Computational Test: Composite

SciMark 2.0 - Mflops, More Is Better
GCC 7.2.0: 1925.72 (SE +/- 8.62, N = 4)
GCC 8.0.0 20171030: 1933.09 (SE +/- 7.80, N = 4)
LLVM Clang 5.0: 1910.11 (SE +/- 9.03, N = 4)
LLVM Clang 6.0 svn316959: 1923.63 (SE +/- 5.14, N = 4)
1. (CC) gcc options: -O3 -march=znver1 -lm
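
As a quick sanity check on the SciMark figures in this file, the sketch below recomputes each compiler's composite as the arithmetic mean of its five kernel scores, which is how SciMark 2.0 defines the composite. The values are copied from the results in this file; the dictionary layout and the helper name `composite` are illustrative assumptions, not part of the Phoronix Test Suite.

```python
# Kernel scores (Mflops) copied from the SciMark 2.0 results in this file,
# ordered as: SOR, Dense LU, Sparse Matrix Multiply, FFT, Monte Carlo.
kernels = {
    "GCC 7.2.0": [1683.40, 5150.44, 2376.61, 223.69, 194.45],
    "GCC 8.0.0 20171030": [1685.48, 4757.53, 2444.39, 223.06, 554.99],
    "LLVM Clang 5.0": [1427.97, 4938.62, 2402.42, 221.42, 560.13],
    "LLVM Clang 6.0 svn316959": [1427.86, 4883.39, 2534.37, 221.33, 551.21],
}

# Composite scores as reported in this file.
reported = {
    "GCC 7.2.0": 1925.72,
    "GCC 8.0.0 20171030": 1933.09,
    "LLVM Clang 5.0": 1910.11,
    "LLVM Clang 6.0 svn316959": 1923.63,
}


def composite(scores):
    """Arithmetic mean of the kernel scores (hypothetical helper)."""
    return sum(scores) / len(scores)


for name, scores in kernels.items():
    print(f"{name}: computed {composite(scores):.2f}, reported {reported[name]:.2f}")
```

This makes it easier to see why GCC 7.2.0 keeps a comparable composite despite its much lower Monte Carlo score: its Dense LU result is the highest of the four and offsets the difference in the mean.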

GNU GMP GMPbench

Total Time

GNU GMP GMPbench 6.1.2 - GMPbench Score, More Is Better
GCC 7.2.0: 3918.80
GCC 8.0.0 20171030: 3926.20
1. (CC) gcc options: -O3 -march=znver1 -lm

FFTW

Build: Float + SSE - Size: 2D FFT Size 1024

FFTW 3.3.6 - Mflops, More Is Better
GCC 7.2.0: 19834 (SE +/- 270.69, N = 3)
GCC 8.0.0 20171030: 21131 (SE +/- 57.10, N = 3)
LLVM Clang 5.0: 18762 (SE +/- 65.36, N = 3)
LLVM Clang 6.0 svn316959: 19036 (SE +/- 25.56, N = 3)
1. (CC) gcc options: -pthread -O3 -march=znver1 -lm

FFTW

Build: Float + SSE - Size: 1D FFT Size 1024

FFTW 3.3.6 - Mflops, More Is Better
GCC 7.2.0: 26763 (SE +/- 26.61, N = 3)
GCC 8.0.0 20171030: 26876 (SE +/- 120.67, N = 3)
LLVM Clang 5.0: 26224 (SE +/- 35.53, N = 3)
LLVM Clang 6.0 svn316959: 26395 (SE +/- 28.81, N = 3)
1. (CC) gcc options: -pthread -O3 -march=znver1 -lm

FFTW

Build: Stock - Size: 2D FFT Size 1024

FFTW 3.3.6 - Mflops, More Is Better
GCC 7.2.0: 6420.23 (SE +/- 104.31, N = 4)
GCC 8.0.0 20171030: 6622.90 (SE +/- 33.11, N = 3)
LLVM Clang 5.0: 6090.37 (SE +/- 10.25, N = 3)
LLVM Clang 6.0 svn316959: 5852.20 (SE +/- 96.36, N = 4)
1. (CC) gcc options: -pthread -O3 -march=znver1 -lm

FFTW

Build: Stock - Size: 1D FFT Size 1024

FFTW 3.3.6 - Mflops, More Is Better
GCC 7.2.0: 8539.10 (SE +/- 4.79, N = 3)
GCC 8.0.0 20171030: 8416.23 (SE +/- 5.10, N = 3)
LLVM Clang 5.0: 7613.47 (SE +/- 8.50, N = 3)
LLVM Clang 6.0 svn316959: 7580.83 (SE +/- 14.03, N = 3)
1. (CC) gcc options: -pthread -O3 -march=znver1 -lm

High Performance Conjugate Gradient

High Performance Conjugate Gradient 3.0 - GFLOP/s, More Is Better
GCC 7.2.0: 0.73 (SE +/- 0.01, N = 3)
GCC 8.0.0 20171030: 0.72 (SE +/- 0.00, N = 3)
LLVM Clang 5.0: 0.74 (SE +/- 0.01, N = 3)
LLVM Clang 6.0 svn316959: 0.72 (SE +/- 0.01, N = 3)


Phoronix Test Suite v10.8.5