AMD EPYC Rome Compiler Benchmarks

AMD AOCC 2.0, GCC, LLVM Clang compiler benchmarks on EPYC 7742. Tests by Michael Larabel for a future article.

GCC 9.1.0:

  Processor: 2 x AMD EPYC 7742 64-Core @ 2.25GHz (128 Cores / 256 Threads), Motherboard: AMD DAYTONA_X (RDY1001C BIOS), Chipset: AMD Device 1480, Memory: 516096MB, Disk: 280GB INTEL SSDPED1D280GA + 6 x 3841GB Micron_9300_MTFDHAL3T8TDP + 256GB Micron_1100_MTFD, Graphics: ASPEED, Monitor: VE228, Network: 2 x Mellanox MT27710

  OS: Ubuntu 19.04, Kernel: 5.2.0-050200rc7-generic (x86_64) 20190630, Desktop: GNOME Shell 3.32.1, Display Server: X Server 1.20.4, Display Driver: modesetting 1.20.4, Compiler: GCC 9.1.0, File-System: ext4, Screen Resolution: 1920x1080

GCC 10.0 Git:

  Processor: 2 x AMD EPYC 7742 64-Core @ 2.25GHz (128 Cores / 256 Threads), Motherboard: AMD DAYTONA_X (RDY1001C BIOS), Chipset: AMD Device 1480, Memory: 516096MB, Disk: 280GB INTEL SSDPED1D280GA + 6 x 3841GB Micron_9300_MTFDHAL3T8TDP + 256GB Micron_1100_MTFD, Graphics: ASPEED, Monitor: VE228, Network: 2 x Mellanox MT27710

  OS: Ubuntu 19.04, Kernel: 5.2.0-050200rc7-generic (x86_64) 20190630, Desktop: GNOME Shell 3.32.1, Display Server: X Server 1.20.4, Display Driver: modesetting 1.20.4, Compiler: GCC 10.0.0 20190804, File-System: ext4, Screen Resolution: 1920x1080

LLVM Clang 9.0 SVN:

  Processor: 2 x AMD EPYC 7742 64-Core @ 2.25GHz (128 Cores / 256 Threads), Motherboard: AMD DAYTONA_X (RDY1001C BIOS), Chipset: AMD Device 1480, Memory: 516096MB, Disk: 280GB INTEL SSDPED1D280GA + 6 x 3841GB Micron_9300_MTFDHAL3T8TDP + 256GB Micron_1100_MTFD, Graphics: ASPEED, Monitor: VE228, Network: 2 x Mellanox MT27710

  OS: Ubuntu 19.04, Kernel: 5.2.0-050200rc7-generic (x86_64) 20190630, Desktop: GNOME Shell 3.32.1, Display Server: X Server 1.20.4, Display Driver: modesetting 1.20.4, Compiler: Clang 9.0.0-svn364739-1~exp1+0~20190701101552.184~1.gbp124358, File-System: ext4, Screen Resolution: 1920x1080

AOCC 2.0:

  Processor: 2 x AMD EPYC 7742 64-Core @ 2.25GHz (128 Cores / 256 Threads), Motherboard: AMD DAYTONA_X (RDY1001C BIOS), Chipset: AMD Device 1480, Memory: 516096MB, Disk: 280GB INTEL SSDPED1D280GA + 6 x 3841GB Micron_9300_MTFDHAL3T8TDP + 256GB Micron_1100_MTFD, Graphics: ASPEED, Monitor: VE228, Network: 2 x Mellanox MT27710

  OS: Ubuntu 19.04, Kernel: 5.2.0-050200rc7-generic (x86_64) 20190630, Desktop: GNOME Shell 3.32.1, Display Server: X Server 1.20.4, Display Driver: modesetting 1.20.4, Compiler: Clang 8.0.0, File-System: ext4, Screen Resolution: 1920x1080

Sockperf 3.4
Test: Throughput
Messages Per Second > Higher Is Better
GCC 9.1.0 .... 477578 |========================================================
GCC 10.0 Git . 480415 |========================================================

Sockperf 3.4
Test: Latency Ping Pong
usec < Lower Is Better
GCC 9.1.0 .... 5.39 |==========================================================
GCC 10.0 Git . 5.39 |==========================================================

Sockperf 3.4
Test: Latency Under Load
usec < Lower Is Better
GCC 9.1.0 .... 22.42 |=====================================================
GCC 10.0 Git . 24.12 |=========================================================

High Performance Conjugate Gradient 3.0
GFLOP/s > Higher Is Better
GCC 9.1.0 .......... 0.36 |====================================================
GCC 10.0 Git ....... 0.36 |====================================================
LLVM Clang 9.0 SVN . 0.35 |===================================================
AOCC 2.0 ........... 0.33 |================================================

lzbench 2017-08-08
Test: XZ 0 - Process: Compression
MB/s > Higher Is Better
GCC 9.1.0 .... 30 |============================================================
GCC 10.0 Git . 30 |============================================================

lzbench 2017-08-08
Test: XZ 0 - Process: Decompression
MB/s > Higher Is Better
GCC 9.1.0 .... 87 |============================================================
GCC 10.0 Git . 86 |===========================================================

lzbench 2017-08-08
Test: Zstd 1 - Process: Compression
MB/s > Higher Is Better
GCC 9.1.0 .... 364 |===========================================================
GCC 10.0 Git . 362 |===========================================================

lzbench 2017-08-08
Test: Zstd 1 - Process: Decompression
MB/s > Higher Is Better
GCC 9.1.0 .... 982 |========================================================
GCC 10.0 Git . 1011 |==========================================================

lzbench 2017-08-08
Test: Brotli 0 - Process: Compression
MB/s > Higher Is Better
GCC 9.1.0 .... 389 |===========================================================
GCC 10.0 Git . 385 |==========================================================
GCC 10.0 Git . 386 |===========================================================

lzbench 2017-08-08
Test: Brotli 0 - Process: Decompression
MB/s > Higher Is Better
GCC 9.1.0 . 446 |==============================================================

lzbench 2017-08-08
Test: Libdeflate 1 - Process: Compression
MB/s > Higher Is Better
GCC 9.1.0 .... 197 |===========================================================
GCC 10.0 Git . 194 |==========================================================

lzbench 2017-08-08
Test: Libdeflate 1 - Process: Decompression
MB/s > Higher Is Better
GCC 9.1.0 .... 880 |===========================================================
GCC 10.0 Git . 876 |===========================================================

FFTW 3.3.6
Build: Stock - Size: 1D FFT Size 2048
Mflops > Higher Is Better
GCC 9.1.0 .......... 8946.23 |=================================================
GCC 10.0 Git ....... 8958.53 |=================================================
LLVM Clang 9.0 SVN . 8273.10 |=============================================
AOCC 2.0 ........... 8007.57 |============================================

FFTW 3.3.6
Build: Stock - Size: 1D FFT Size 4096
Mflops > Higher Is Better
GCC 9.1.0 .......... 8808.93 |=================================================
GCC 10.0 Git ....... 8816.73 |=================================================
LLVM Clang 9.0 SVN . 8258.37 |==============================================
AOCC 2.0 ........... 7503.57 |==========================================

FFTW 3.3.6
Build: Stock - Size: 2D FFT Size 2048
Mflops > Higher Is Better
GCC 9.1.0 .......... 7381.00 |=================================================
GCC 10.0 Git ....... 7288.80 |================================================
LLVM Clang 9.0 SVN . 6627.77 |============================================
AOCC 2.0 ........... 6462.33 |===========================================

FFTW 3.3.6
Build: Stock - Size: 2D FFT Size 4096
Mflops > Higher Is Better
GCC 9.1.0 .......... 6327.20 |=================================================
GCC 10.0 Git ....... 6289.30 |=================================================
LLVM Clang 9.0 SVN . 5842.73 |=============================================
AOCC 2.0 ........... 5593.44 |===========================================

SciMark 2.0
Computational Test: Composite
Mflops > Higher Is Better
GCC 9.1.0 .......... 2834.06 |================================================
GCC 10.0 Git ....... 2828.01 |================================================
LLVM Clang 9.0 SVN . 2880.56 |=================================================
AOCC 2.0 ........... 2730.13 |==============================================

SciMark 2.0
Computational Test: Monte Carlo
Mflops > Higher Is Better
GCC 9.1.0 .......... 612.17 |=================================================
GCC 10.0 Git ....... 610.98 |=================================================
LLVM Clang 9.0 SVN . 621.05 |==================================================
AOCC 2.0 ........... 605.88 |=================================================

SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops > Higher Is Better
GCC 9.1.0 .......... 203.68 |=============================================
GCC 10.0 Git ....... 202.71 |=============================================
LLVM Clang 9.0 SVN . 224.52 |==================================================
AOCC 2.0 ........... 178.57 |========================================

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops > Higher Is Better
GCC 9.1.0 .......... 2741.97 |========================================
GCC 10.0 Git ....... 2763.50 |========================================
LLVM Clang 9.0 SVN . 3382.93 |=================================================
AOCC 2.0 ........... 2736.94 |========================================

SciMark 2.0
Computational Test: Dense LU Matrix Factorization
Mflops > Higher Is Better
GCC 9.1.0 .......... 8811.17 |=================================================
GCC 10.0 Git ....... 8762.75 |=================================================
LLVM Clang 9.0 SVN . 8518.52 |===============================================
AOCC 2.0 ........... 8473.11 |===============================================

SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops > Higher Is Better
GCC 9.1.0 .......... 1801.31 |=================================================
GCC 10.0 Git ....... 1800.11 |=================================================
LLVM Clang 9.0 SVN . 1655.77 |=============================================
AOCC 2.0 ........... 1656.18 |=============================================

TSCP 1.81
AI Chess Performance
Nodes Per Second > Higher Is Better
GCC 9.1.0 .......... 1072812 |==============================================
GCC 10.0 Git ....... 1040775 |============================================
LLVM Clang 9.0 SVN . 1149370 |=================================================
AOCC 2.0 ........... 1093682 |===============================================

John The Ripper 1.9.0-jumbo-1
Test: Blowfish
Real C/S > Higher Is Better
GCC 9.1.0 .......... 148302 |========================================
GCC 10.0 Git ....... 148542 |========================================
LLVM Clang 9.0 SVN . 187652 |==================================================
AOCC 2.0 ........... 187140 |==================================================

MKL-DNN 2019-04-16
Harness: IP Batch 1D - Data Type: f32
ms < Lower Is Better
GCC 9.1.0 .......... 1585.03 |=================================================
GCC 10.0 Git ....... 946.12 |=============================
LLVM Clang 9.0 SVN . 16.76 |=
AOCC 2.0 ........... 11.60 |

MKL-DNN 2019-04-16
Harness: IP Batch All - Data Type: f32
ms < Lower Is Better
GCC 9.1.0 .......... 6593.30 |==============================================
GCC 10.0 Git ....... 6961.88 |=================================================
LLVM Clang 9.0 SVN . 90.66 |=
AOCC 2.0 ........... 76.80 |=

MKL-DNN 2019-04-16
Harness: Convolution Batch conv_3d - Data Type: f32
ms < Lower Is Better
GCC 9.1.0 .......... 387.71 |============================================
GCC 10.0 Git ....... 440.31 |==================================================
LLVM Clang 9.0 SVN . 2.66 |
AOCC 2.0 ........... 2.71 |

MKL-DNN 2019-04-16
Harness: Convolution Batch conv_all - Data Type: f32
ms < Lower Is Better
GCC 9.1.0 .......... 37344.80 |===============================================
GCC 10.0 Git ....... 38147.16 |================================================
LLVM Clang 9.0 SVN . 371.95 |
AOCC 2.0 ........... 370.63 |

MKL-DNN 2019-04-16
Harness: Deconvolution Batch deconv_1d - Data Type: f32
ms < Lower Is Better
GCC 9.1.0 .......... 818.78 |===========================================
GCC 10.0 Git ....... 955.00 |==================================================
LLVM Clang 9.0 SVN . 6.08 |
AOCC 2.0 ........... 5.40 |

MKL-DNN 2019-04-16
Harness: Deconvolution Batch deconv_3d - Data Type: f32
ms < Lower Is Better
GCC 9.1.0 .......... 588.07 |==================================================
GCC 10.0 Git ....... 500.54 |===========================================
LLVM Clang 9.0 SVN . 2.74 |
AOCC 2.0 ........... 2.73 |

MKL-DNN 2019-04-16
Harness: Convolution Batch conv_alexnet - Data Type: f32
ms < Lower Is Better
GCC 9.1.0 .......... 4136.48 |=================================================
GCC 10.0 Git ....... 4171.88 |=================================================
LLVM Clang 9.0 SVN . 42.49 |
AOCC 2.0 ........... 42.78 |=

MKL-DNN 2019-04-16
Harness: Deconvolution Batch deconv_all - Data Type: f32
ms < Lower Is Better
GCC 9.1.0 .......... 208536.00 |===============================================
GCC 10.0 Git ....... 207086.00 |===============================================
LLVM Clang 9.0 SVN . 2590.09 |=
AOCC 2.0 ........... 1917.89 |

MKL-DNN 2019-04-16
Harness: Convolution Batch conv_googlenet_v3 - Data Type: f32
ms < Lower Is Better
GCC 9.1.0 .......... 3194.83 |===============================================
GCC 10.0 Git ....... 3342.14 |=================================================
LLVM Clang 9.0 SVN . 23.30 |
AOCC 2.0 ........... 23.21 |

SVT-AV1 0.5
1080p 8-bit YUV To AV1 Video Encode
Frames Per Second > Higher Is Better
GCC 9.1.0 .......... 101.07 |=================================================
GCC 10.0 Git ....... 100.44 |=================================================
LLVM Clang 9.0 SVN . 102.78 |==================================================

SVT-HEVC 2019-02-03
1080p 8-bit YUV To HEVC Video Encode
Frames Per Second > Higher Is Better
GCC 9.1.0 .......... 343.29 |==================================================
GCC 10.0 Git ....... 344.04 |==================================================
LLVM Clang 9.0 SVN . 337.75 |=================================================

SVT-VP9 2019-02-17
1080p 8-bit YUV To VP9 Video Encode
Frames Per Second > Higher Is Better
GCC 9.1.0 .......... 277.45 |=================================================
GCC 10.0 Git ....... 283.63 |==================================================
LLVM Clang 9.0 SVN . 274.55 |================================================

x264 2018-09-25
H.264 Video Encoding
Frames Per Second > Higher Is Better
GCC 9.1.0 .......... 153.31 |=================================================
GCC 10.0 Git ....... 152.43 |================================================
LLVM Clang 9.0 SVN . 153.95 |=================================================
AOCC 2.0 ........... 157.18 |==================================================

x265 3.0
H.265 1080p Video Encoding
Frames Per Second > Higher Is Better
GCC 9.1.0 .......... 44.54 |==================================================
GCC 10.0 Git ....... 44.73 |==================================================
LLVM Clang 9.0 SVN . 45.37 |===================================================
AOCC 2.0 ........... 45.57 |===================================================

GraphicsMagick 1.3.30
Operation: Swirl
Iterations Per Minute > Higher Is Better
GCC 9.1.0 .......... 218 |=====================================================
GCC 10.0 Git ....... 219 |=====================================================
LLVM Clang 9.0 SVN . 216 |====================================================
AOCC 2.0 ........... 194 |===============================================

GraphicsMagick 1.3.30
Operation: Rotate
Iterations Per Minute > Higher Is Better
GCC 9.1.0 .......... 204 |==============================================
GCC 10.0 Git ....... 202 |=============================================
LLVM Clang 9.0 SVN . 237 |=====================================================
AOCC 2.0 ........... 205 |==============================================

GraphicsMagick 1.3.30
Operation: Sharpen
Iterations Per Minute > Higher Is Better
GCC 9.1.0 .......... 207 |====================================================
GCC 10.0 Git ....... 207 |====================================================
LLVM Clang 9.0 SVN . 209 |=====================================================
AOCC 2.0 ........... 186 |===============================================

GraphicsMagick 1.3.30
Operation: Enhanced
Iterations Per Minute > Higher Is Better
GCC 9.1.0 .......... 215 |=====================================================
GCC 10.0 Git ....... 213 |=====================================================
LLVM Clang 9.0 SVN . 215 |=====================================================
AOCC 2.0 ........... 190 |===============================================

GraphicsMagick 1.3.30
Operation: Resizing
Iterations Per Minute > Higher Is Better
GCC 9.1.0 .......... 102 |=============================================
GCC 10.0 Git ....... 102 |=============================================
LLVM Clang 9.0 SVN . 119 |=====================================================
AOCC 2.0 ........... 120 |=====================================================

GraphicsMagick 1.3.30
Operation: Noise-Gaussian
Iterations Per Minute > Higher Is Better
GCC 9.1.0 .......... 208 |==================================================
GCC 10.0 Git ....... 207 |==================================================
LLVM Clang 9.0 SVN . 220 |=====================================================
AOCC 2.0 ........... 195 |===============================================

GraphicsMagick 1.3.30
Operation: HWB Color Space
Iterations Per Minute > Higher Is Better
GCC 9.1.0 .......... 228 |====================================================
GCC 10.0 Git ....... 226 |===================================================
LLVM Clang 9.0 SVN . 234 |=====================================================
AOCC 2.0 ........... 202 |==============================================

Coremark 1.0
CoreMark Size 666 - Iterations Per Second
Iterations/Sec > Higher Is Better
GCC 9.1.0 .......... 3868113.71 |==============================================
GCC 10.0 Git ....... 3825301.45 |=============================================
LLVM Clang 9.0 SVN . 3024508.81 |====================================
AOCC 2.0 ........... 3284059.92 |=======================================

Timed LLVM Compilation 6.0.1
Time To Compile
Seconds < Lower Is Better
GCC 9.1.0 .......... 96.29 |=================================
GCC 10.0 Git ....... 96.85 |=================================
LLVM Clang 9.0 SVN . 78.86 |===========================
AOCC 2.0 ........... 146.60 |==================================================

Timed PHP Compilation 7.1.9
Time To Compile
Seconds < Lower Is Better
GCC 9.1.0 .......... 64.62 |==============
GCC 10.0 Git ....... 65.06 |==============
LLVM Clang 9.0 SVN . 88.30 |===================
AOCC 2.0 ........... 230.21 |==================================================

C-Ray 1.1
Total Time - 4K, 16 Rays Per Pixel
Seconds < Lower Is Better
GCC 9.1.0 .......... 5.72 |================================
GCC 10.0 Git ....... 5.82 |=================================
LLVM Clang 9.0 SVN . 9.19 |====================================================
AOCC 2.0 ........... 9.01 |===================================================

AOBench
Size: 2048 x 2048 - Total Time
Seconds < Lower Is Better
GCC 9.1.0 .......... 36.77 |=============================================
GCC 10.0 Git ....... 35.33 |===========================================
LLVM Clang 9.0 SVN . 41.75 |===================================================
AOCC 2.0 ........... 37.02 |=============================================

dav1d 0.3
Video Input: Summer Nature 4K
Seconds < Lower Is Better
GCC 9.1.0 .......... 11.52 |===================================================
GCC 10.0 Git ....... 11.16 |=================================================
LLVM Clang 9.0 SVN . 11.56 |===================================================
AOCC 2.0 ........... 11.14 |=================================================

dav1d 0.3
Video Input: Summer Nature 1080p
Seconds < Lower Is Better
GCC 9.1.0 .......... 4.75 |===================================================
GCC 10.0 Git ....... 4.77 |===================================================
LLVM Clang 9.0 SVN . 4.86 |====================================================
AOCC 2.0 ........... 4.77 |===================================================

libjpeg-turbo tjbench 2.0.2
Test: Decompression Throughput
Megapixels/sec > Higher Is Better
GCC 9.1.0 .......... 175.16 |==================================================
GCC 10.0 Git ....... 175.48 |==================================================
LLVM Clang 9.0 SVN . 174.13 |=================================================
AOCC 2.0 ........... 175.96 |==================================================

CppPerformanceBenchmarks 9
Test: Atol
Seconds < Lower Is Better
GCC 9.1.0 .......... 74.36 |===================================================
GCC 10.0 Git ....... 74.32 |===================================================
LLVM Clang 9.0 SVN . 74.10 |===================================================
AOCC 2.0 ........... 74.38 |===================================================

CppPerformanceBenchmarks 9
Test: Ctype
Seconds < Lower Is Better
GCC 9.1.0 .......... 41.66 |===================================================
GCC 10.0 Git ....... 39.33 |================================================
LLVM Clang 9.0 SVN . 36.74 |=============================================
AOCC 2.0 ........... 37.81 |==============================================

CppPerformanceBenchmarks 9
Test: Math Library
Seconds < Lower Is Better
GCC 9.1.0 .......... 343.45 |==================================================
GCC 10.0 Git ....... 333.85 |=================================================
LLVM Clang 9.0 SVN . 331.02 |================================================
AOCC 2.0 ........... 330.41 |================================================

CppPerformanceBenchmarks 9
Test: Random Numbers
Seconds < Lower Is Better
GCC 9.1.0 .......... 1599.07 |=========================================
GCC 10.0 Git ....... 1627.99 |==========================================
LLVM Clang 9.0 SVN . 1892.46 |=================================================
AOCC 2.0 ........... 1894.02 |=================================================

CppPerformanceBenchmarks 9
Test: Stepanov Vector
Seconds < Lower Is Better
GCC 9.1.0 .......... 99.21 |===================================================
GCC 10.0 Git ....... 97.70 |==================================================
LLVM Clang 9.0 SVN . 85.89 |============================================
AOCC 2.0 ........... 89.73 |==============================================

CppPerformanceBenchmarks 9
Test: Function Objects
Seconds < Lower Is Better
GCC 9.1.0 .......... 18.93 |===================================================
GCC 10.0 Git ....... 19.01 |===================================================
LLVM Clang 9.0 SVN . 18.82 |==================================================
AOCC 2.0 ........... 18.94 |===================================================

CppPerformanceBenchmarks 9
Test: Stepanov Abstraction
Seconds < Lower Is Better
GCC 9.1.0 .......... 36.66 |===================================================
GCC 10.0 Git ....... 36.52 |===================================================
LLVM Clang 9.0 SVN . 33.44 |===============================================
AOCC 2.0 ........... 34.10 |===============================================

Apache Benchmark 2.4.29
Static Web Page Serving
Requests Per Second > Higher Is Better
GCC 9.1.0 .......... 24915.19 |================================================
GCC 10.0 Git ....... 24096.60 |==============================================
LLVM Clang 9.0 SVN . 24594.66 |===============================================
AOCC 2.0 ........... 24163.31 |===============================================

Apache Siege 2.4.29
Concurrent Users: 100
Transactions Per Second > Higher Is Better
GCC 9.1.0 .......... 26315.80 |================================================
GCC 10.0 Git ....... 26336.61 |================================================
LLVM Clang 9.0 SVN . 25958.51 |===============================================

Apache Siege 2.4.29
Concurrent Users: 200
Transactions Per Second > Higher Is Better
GCC 9.1.0 .......... 32222.72 |================================================
GCC 10.0 Git ....... 30752.54 |==============================================
LLVM Clang 9.0 SVN . 32213.59 |================================================

Apache Siege 2.4.29
Concurrent Users: 250
Transactions Per Second > Higher Is Better
GCC 9.1.0 .......... 33869.93 |================================================
GCC 10.0 Git ....... 32367.31 |==============================================
LLVM Clang 9.0 SVN . 33938.13 |================================================