GCC AMD Ryzen Zen znver1 Compiler Optimizations

AMD Ryzen 7 1800X Eight-Core testing for a future article.

HTML result view exported from: https://openbenchmarking.org/result/1703031-RI-GCCAMDRYZ39&sor&grw.

Configurations tested: -O3, -O3 -march=k8-sse3, -O3 -march=bdver1, -O3 -march=bdver4, -O3 -march=znver1

Processor: AMD Ryzen 7 1800X Eight-Core @ 3.60GHz (16 Cores)
Motherboard: MSI X370 XPOWER GAMING TITANIUM (MS-7A31) v1.0
Chipset: AMD Device 1450
Memory: 16384MB
Disk: 256GB INTEL SSDPEKKW256G7
Graphics: Sapphire AMD Radeon R9 FURY / NANO 4096MB
Audio: AMD Fiji HDMI/DP
Monitor: DELL P2415Q
Network: Intel I211 Gigabit Connection
OS: Ubuntu 17.04
Kernel: 4.10.0-9-generic (x86_64)
Desktop: Unity 7.5.0
Display Server: X Server 1.18.4
Display Driver: modesetting 1.18.4
OpenGL: 4.5 Mesa 17.0.0- padoka PPA Gallium 0.4 (LLVM 4.0.0)
Vulkan: 1.0.39
Compiler: GCC 6.3.0 20161229
File-System: ext4
Screen Resolution: 3840x2160

Compiler Details
- --build=x86_64-linux-gnu --disable-browser-plugin --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-arch-directory=amd64 --with-default-libstdcxx-abi=new --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic -v

Processor Details
- Scaling Governor: acpi-cpufreq performance

Results overview. Columns other than -O3 are -O3 -march=<name>; a dash means no result is reported for that configuration.

Test                                                       -O3           k8-sse3       bdver1        bdver4        znver1
scimark2: Composite                                        1585.02       1533.07       1605.68       1602.85       1588.29
scimark2: Monte Carlo                                      738.39        727.86        728.45        727.88        738.77
scimark2: Fast Fourier Transform                           157.64        168.82        150.42        149.47        148.46
scimark2: Sparse Matrix Multiply                           2421.08       2524.93       2518.05       2519.53       2568.32
scimark2: Dense LU Matrix Factorization                    3436.15       3071.16       3460.15       3446.64       3310.83
scimark2: Jacobi Successive Over-Relaxation                1171.83       1172.56       1171.31       1170.73       1175.08
hint: FLOAT                                                333762114.34  332568933.81  -             -             333251163.50
encode-flac: WAV To FLAC                                   5.22          5.71          -             -             5.70
encode-mp3: WAV To MP3                                     9.00          9.99          -             -             8.73
tjbench: Decompression Throughput                          178.19        178.13        -             -             180.57
fftw: Float + SSE - 2D FFT Size 1024                       21035         20628         -             -             22209
hmmer: Pfam Database Search                                7.11          7.20          -             -             7.12
himeno: Poisson Pressure Solver                            1195.38       1190.23       1194.42       1186.02       1135.46
build-imagemagick: Time To Compile                         177.31        177.93        177.46        178.17        177.73
build-apache: Time To Compile                              24.33         24.51         26.48         26.38         26.32
stockfish: Total Time                                      3615          3608          3623          -             3611
john-the-ripper: Blowfish                                  12798         12829         12878         12887         12881
graphics-magick: Blur                                      186           173           -             -             192
graphics-magick: Sharpen                                   187           157           -             -             204
graphics-magick: Resizing                                  241           241           -             -             252
graphics-magick: HWB Color Space                           254           254           -             -             261
graphics-magick: Local Adaptive Thresholding               140           140           -             -             143
c-ray: Total Time                                          8.17          12.36         -             -             7.64
ttsiod-renderer: Phong Rendering With Soft-Shadow Mapping  342.43        343.39        355.79        -             355.26
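The overview mixes "more is better" and "fewer is better" metrics, so raw numbers are not directly comparable across rows. As a rough aid, the sketch below converts a few rows into a percentage change relative to the plain -O3 run, flipping the ratio for time-based tests. The values are copied from the results above; the `RESULTS` layout and the `gain_vs_baseline` helper are hypothetical names introduced only for this illustration.

```python
# Hypothetical helper: percentage gain of each configuration over plain -O3.
# Values are copied from the results in this export; the data layout is an
# assumption made for illustration, not a Phoronix Test Suite format.

RESULTS = {
    # test name: (higher_is_better, {configuration: reported mean})
    "scimark2: Composite (Mflops)": (
        True,
        {"-O3": 1585.02, "k8-sse3": 1533.07, "bdver1": 1605.68,
         "bdver4": 1602.85, "znver1": 1588.29},
    ),
    "graphics-magick: Sharpen (iterations/min)": (
        True,
        {"-O3": 187, "k8-sse3": 157, "znver1": 204},
    ),
    "c-ray: Total Time (s)": (
        False,
        {"-O3": 8.17, "k8-sse3": 12.36, "znver1": 7.64},
    ),
}


def gain_vs_baseline(higher_is_better, values, baseline="-O3"):
    """Return {configuration: percent gain over baseline}; positive means faster."""
    base = values[baseline]
    gains = {}
    for config, value in values.items():
        if config == baseline:
            continue
        # For time-based tests a smaller value is better, so invert the ratio.
        ratio = value / base if higher_is_better else base / value
        gains[config] = (ratio - 1.0) * 100.0
    return gains


if __name__ == "__main__":
    for test, (higher_is_better, values) in RESULTS.items():
        print(test)
        for config, gain in gain_vs_baseline(higher_is_better, values).items():
            print(f"  {config:<8} {gain:+.1f}%")
```

Normalizing this way makes it easier to see, for example, that C-Ray and GraphicsMagick Sharpen move by several percent between configurations while the SciMark composite barely changes.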

SciMark

Computational Test: Composite

SciMark 2.0 (Mflops, more is better)

-O3 -march=bdver1     1605.68   SE +/- 7.57, N = 4
-O3 -march=bdver4     1602.85   SE +/- 6.07, N = 4
-O3 -march=znver1     1588.29   SE +/- 8.26, N = 4
-O3                   1585.02   SE +/- 7.65, N = 4
-O3 -march=k8-sse3    1533.07   SE +/- 5.25, N = 4

1. (CC) gcc options: -O3 -lm
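Many of the gaps in these results are small next to the reported standard errors. One rough way to judge whether two configurations really differ is to divide the difference of their means by the combined standard error. The sketch below does this for the SciMark composite score. It assumes the SE values in the export are listed in the same order as the configurations, and with N = 4 it is a heuristic only, not a formal significance test. The `separation` function is a hypothetical helper.

```python
import math


def separation(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two means in units of their combined standard error."""
    return abs(mean_a - mean_b) / math.hypot(se_a, se_b)


# SciMark 2.0 Composite, Mflops: (mean, standard error).
# The pairing of SE values with configurations is an assumption based on
# the order in which they appear in the export.
baseline = (1585.02, 7.65)   # -O3
bdver1 = (1605.68, 7.57)     # -O3 -march=bdver1
znver1 = (1588.29, 8.26)     # -O3 -march=znver1

if __name__ == "__main__":
    print(f"bdver1 vs -O3: {separation(*bdver1, *baseline):.2f} combined SE")
    print(f"znver1 vs -O3: {separation(*znver1, *baseline):.2f} combined SE")
```

Under these assumptions the bdver1 result sits just under two combined standard errors from the baseline, while the znver1 result is well within one, so the latter is best read as no measurable difference.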

SciMark

Computational Test: Monte Carlo

SciMark 2.0 (Mflops, more is better)

-O3 -march=znver1     738.77   SE +/- 0.14, N = 4
-O3                   738.39   SE +/- 0.06, N = 4
-O3 -march=bdver1     728.45   SE +/- 0.13, N = 4
-O3 -march=bdver4     727.88   SE +/- 0.29, N = 4
-O3 -march=k8-sse3    727.86   SE +/- 0.14, N = 4

1. (CC) gcc options: -O3 -lm

SciMark

Computational Test: Fast Fourier Transform

SciMark 2.0 (Mflops, more is better)

-O3 -march=k8-sse3    168.82   SE +/- 0.44, N = 4
-O3                   157.64   SE +/- 0.99, N = 4
-O3 -march=bdver1     150.42   SE +/- 0.28, N = 4
-O3 -march=bdver4     149.47   SE +/- 0.41, N = 4
-O3 -march=znver1     148.46   SE +/- 0.21, N = 4

1. (CC) gcc options: -O3 -lm

SciMark

Computational Test: Sparse Matrix Multiply

SciMark 2.0 (Mflops, more is better)

-O3 -march=znver1     2568.32   SE +/- 22.07, N = 4
-O3 -march=k8-sse3    2524.93   SE +/- 14.04, N = 4
-O3 -march=bdver4     2519.53   SE +/- 14.04, N = 4
-O3 -march=bdver1     2518.05   SE +/- 25.26, N = 4
-O3                   2421.08   SE +/- 7.04, N = 4

1. (CC) gcc options: -O3 -lm

SciMark

Computational Test: Dense LU Matrix Factorization

SciMark 2.0 (Mflops, more is better)

-O3 -march=bdver1     3460.15   SE +/- 18.20, N = 4
-O3 -march=bdver4     3446.64   SE +/- 21.23, N = 4
-O3                   3436.15   SE +/- 36.57, N = 4
-O3 -march=znver1     3310.83   SE +/- 47.79, N = 4
-O3 -march=k8-sse3    3071.16   SE +/- 20.09, N = 4

1. (CC) gcc options: -O3 -lm

SciMark

Computational Test: Jacobi Successive Over-Relaxation

SciMark 2.0 (Mflops, more is better)

-O3 -march=znver1     1175.08   SE +/- 0.14, N = 4
-O3 -march=k8-sse3    1172.56   SE +/- 1.23, N = 4
-O3                   1171.83   SE +/- 1.51, N = 4
-O3 -march=bdver1     1171.31   SE +/- 1.02, N = 4
-O3 -march=bdver4     1170.73   SE +/- 0.71, N = 4

1. (CC) gcc options: -O3 -lm

Hierarchical INTegration

Test: FLOAT

Hierarchical INTegration 1.0 (QUIPs, more is better)

-O3                   333762114.34   SE +/- 1715194.48, N = 3
-O3 -march=znver1     333251163.50   SE +/- 3555529.86, N = 3
-O3 -march=k8-sse3    332568933.81   SE +/- 120531.27, N = 3

1. (CC) gcc options: -O3 -lm

FLAC Audio Encoding

WAV To FLAC

FLAC Audio Encoding 1.3.1 (Seconds, fewer is better)

-O3                   5.22   SE +/- 0.00, N = 5
-O3 -march=znver1     5.70   SE +/- 0.00, N = 5
-O3 -march=k8-sse3    5.71   SE +/- 0.00, N = 5

1. (CXX) g++ options: -O3 -fvisibility=hidden -logg -lm

LAME MP3 Encoding

WAV To MP3

LAME MP3 Encoding 3.99.3 (Seconds, fewer is better)

-O3 -march=znver1     8.73   SE +/- 0.01, N = 5
-O3                   9.00   SE +/- 0.02, N = 5
-O3 -march=k8-sse3    9.99   SE +/- 0.01, N = 5

1. (CC) gcc options: -O3 -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -pipe -lm

libjpeg-turbo tjbench

Test: Decompression Throughput

libjpeg-turbo tjbench 1.5.1 (Megapixels/sec, more is better)

-O3 -march=znver1     180.57   SE +/- 1.16, N = 3
-O3                   178.19   SE +/- 0.22, N = 3
-O3 -march=k8-sse3    178.13   SE +/- 0.19, N = 3

1. (CC) gcc options: -O3 -lm

FFTW

Build: Float + SSE - Size: 2D FFT Size 1024

FFTW 3.3.4 (Mflops, more is better)

-O3 -march=znver1     22209   SE +/- 80.85, N = 5
-O3                   21035   SE +/- 94.45, N = 5
-O3 -march=k8-sse3    20628   SE +/- 117.99, N = 5

1. (CC) gcc options: -O3 -lm

Timed HMMer Search

Pfam Database Search

Timed HMMer Search 2.3.2 (Seconds, fewer is better)

-O3                   7.11   SE +/- 0.04, N = 3
-O3 -march=znver1     7.12   SE +/- 0.02, N = 3
-O3 -march=k8-sse3    7.20   SE +/- 0.02, N = 3

1. (CC) gcc options: -O3 -pthread -lhmmer -lsquid -lm

Himeno Benchmark

Poisson Pressure Solver

Himeno Benchmark 3.0 (MFLOPS, more is better)

-O3                   1195.38   SE +/- 0.54, N = 3
-O3 -march=bdver1     1194.42   SE +/- 0.95, N = 3
-O3 -march=k8-sse3    1190.23   SE +/- 0.85, N = 3
-O3 -march=bdver4     1186.02   SE +/- 0.53, N = 3
-O3 -march=znver1     1135.46   SE +/- 0.39, N = 3

1. (CC) gcc options: -O3 -mavx2

Timed ImageMagick Compilation

Time To Compile

Timed ImageMagick Compilation 6.9.0 (Seconds, fewer is better)

-O3                   177.31   SE +/- 0.26, N = 3
-O3 -march=bdver1     177.46   SE +/- 0.18, N = 3
-O3 -march=znver1     177.73   SE +/- 0.93, N = 3
-O3 -march=k8-sse3    177.93   SE +/- 0.33, N = 3
-O3 -march=bdver4     178.17   SE +/- 0.28, N = 3

Timed Apache Compilation

Time To Compile

Timed Apache Compilation 2.4.7 (Seconds, fewer is better)

-O3                   24.33   SE +/- 0.15, N = 3
-O3 -march=k8-sse3    24.51   SE +/- 0.20, N = 3
-O3 -march=znver1     26.32   SE +/- 0.03, N = 3
-O3 -march=bdver4     26.38   SE +/- 0.03, N = 3
-O3 -march=bdver1     26.48   SE +/- 0.21, N = 3

Stockfish

Total Time

Stockfish 2014-11-26 (ms, fewer is better)

-O3 -march=k8-sse3    3608   SE +/- 5.46, N = 3
-O3 -march=znver1     3611   SE +/- 1.00, N = 3
-O3                   3615   SE +/- 0.88, N = 3
-O3 -march=bdver1     3623   SE +/- 1.45, N = 3

1. (CXX) g++ options: -lpthread -O3 -fno-exceptions -fno-rtti -ansi -pedantic -msse -msse3 -mpopcnt -flto

John The Ripper

Test: Blowfish

John The Ripper 1.8.0 (Real C/S, more is better)

-O3 -march=bdver4     12887   SE +/- 7.77, N = 3
-O3 -march=znver1     12881   SE +/- 13.04, N = 3
-O3 -march=bdver1     12878   SE +/- 13.04, N = 3
-O3 -march=k8-sse3    12829   SE +/- 23.33, N = 3
-O3                   12798   SE +/- 104.33, N = 3

1. (CC) gcc options: -fopenmp -lcrypt

GraphicsMagick

Operation: Blur

GraphicsMagick 1.3.19 (Iterations Per Minute, more is better)

-O3 -march=znver1     192   SE +/- 0.58, N = 3
-O3                   186   SE +/- 0.67, N = 3
-O3 -march=k8-sse3    173   SE +/- 0.33, N = 3

1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lgomp -lpthread

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.19 (Iterations Per Minute, more is better)

-O3 -march=znver1     204
-O3                   187
-O3 -march=k8-sse3    157

1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lgomp -lpthread

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.19 (Iterations Per Minute, more is better)

-O3 -march=znver1     252
-O3 -march=k8-sse3    241
-O3                   241

1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lgomp -lpthread

GraphicsMagick

Operation: HWB Color Space

GraphicsMagick 1.3.19 (Iterations Per Minute, more is better)

-O3 -march=znver1     261
-O3 -march=k8-sse3    254
-O3                   254

1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lgomp -lpthread

GraphicsMagick

Operation: Local Adaptive Thresholding

GraphicsMagick 1.3.19 (Iterations Per Minute, more is better)

-O3 -march=znver1     143
-O3 -march=k8-sse3    140
-O3                   140

1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lgomp -lpthread

C-Ray

Total Time

C-Ray 1.1 (Seconds, fewer is better)

-O3 -march=znver1     7.64    SE +/- 0.00, N = 3
-O3                   8.17    SE +/- 0.00, N = 3
-O3 -march=k8-sse3    12.36   SE +/- 0.01, N = 3

1. (CC) gcc options: -lm -lpthread -O3

TTSIOD 3D Renderer

Phong Rendering With Soft-Shadow Mapping

TTSIOD 3D Renderer 2.3a (FPS, more is better)

-O3 -march=bdver1     355.79   SE +/- 1.48, N = 3
-O3 -march=znver1     355.26   SE +/- 0.60, N = 3
-O3 -march=k8-sse3    343.39   SE +/- 0.35, N = 3
-O3                   342.43   SE +/- 0.71, N = 3

1. (CXX) g++ options: -O3 -fomit-frame-pointer -ffast-math -mtune=native -flto -msse -mrecip -mfpmath=sse -msse2 -mssse3 -lSDL -lstdc++


Phoronix Test Suite v10.8.4