AMD EPYC Compiler Tuning

GCC 9 compiler tuning benchmarks by Michael Larabel for a future article on Phoronix.com.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 1902194-SP-AMDEPYCCO19
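
When comparing against this result file, it can help to turn the per-test rows of the text export into numbers and normalize them against a baseline flag set. The following is a minimal Python sketch, assuming each row follows the `flags .... value |====` layout used in the results of this file; the names `parse_rows` and `relative_to` are illustrative and not part of the Phoronix Test Suite. The sample rows are taken from the FLAC Audio Encoding result.

```python
import re

# One result row: a flag set, a run of dot leaders, the value, then the bar.
# Assumption: flag sets always start with an -O level, as they do in this file.
ROW = re.compile(r"(-O\w+(?: -[\w=-]+)*) \.+ ([\d.]+) \|=+")


def parse_rows(text):
    """Return {flag_set: value} for every result row found in text."""
    return {flags: float(value) for flags, value in ROW.findall(text)}


def relative_to(rows, baseline, lower_is_better):
    """Express each result as a speedup factor over the baseline flag set."""
    base = rows[baseline]
    if lower_is_better:
        return {flags: base / value for flags, value in rows.items()}
    return {flags: value / base for flags, value in rows.items()}


SAMPLE = """
-O0 ....................................... 96.77 |============================
-O2 ....................................... 13.65 |====
-O3 -march=znver1 ......................... 13.85 |====
"""

rows = parse_rows(SAMPLE)
speedup = relative_to(rows, "-O0", lower_is_better=True)
for flags, factor in speedup.items():
    print(f"{flags:<24} {factor:.2f}x")
```

Because the pattern is applied with `findall`, it does not depend on where line breaks fall, so it should also work on an export whose rows have been run together.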

Run Management

Result Identifier                          Date               Test Duration
-O0                                        February 15 2019   5 Hours, 52 Minutes
-Og                                        February 19 2019   5 Hours, 57 Minutes
-O1                                        February 16 2019   6 Hours, 33 Minutes
-O2                                        February 16 2019   6 Hours, 13 Minutes
-O2 -ftree-vectorize -ftree-slp-vectorize  February 18 2019   5 Hours, 47 Minutes
-O2 -march=znver1                          February 17 2019   6 Hours, 29 Minutes
-O2 -flto                                  February 18 2019   4 Hours, 46 Minutes
-O3                                        February 16 2019   7 Hours, 15 Minutes
-O3 -march=znver1                          February 15 2019   4 Hours, 53 Minutes
-O3 -march=znver1 -flto                    February 18 2019   7 Hours, 2 Minutes
-Ofast -march=znver1                       February 17 2019   6 Hours, 52 Minutes
Average                                                       6 Hours, 9 Minutes



System configuration (identical for all eleven runs: -O0, -Og, -O1, -O2, -O2 -ftree-vectorize -ftree-slp-vectorize, -O2 -march=znver1, -O2 -flto, -O3, -O3 -march=znver1, -O3 -march=znver1 -flto, -Ofast -march=znver1):

Processor: 2 x AMD EPYC 7601 32-Core (64 Cores / 128 Threads)
Motherboard: Dell 02MJ3T (1.2.5 BIOS)
Chipset: AMD Family 17h
Memory: 16 x 32 GB DDR4-2400MT/s 36ASF4G72PZ-2G6D2
Disk: 120GB SSDSCKJB120G7R + 20 x 500GB Samsung SSD 860
Graphics: Matrox G200eW3
Monitor: VE228
Network: 2 x Broadcom BCM57416 NetXtreme-E 10GBase-T RDMA + 2 x Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 18.04
Kernel: 5.0.0-050000rc6-generic (x86_64) 20190210
Desktop: GNOME Shell 3.28.3
Display Server: X Server
Compiler: GCC 9.0.1 20190210
File-System: ext4
Screen Resolution: 1600x1200

FLAC Audio Encoding 1.3.2
WAV To FLAC
Seconds < Lower Is Better
-O0 ....................................... 96.77 |============================
-Og ....................................... 15.58 |=====
-O1 ....................................... 15.01 |====
-O2 ....................................... 13.65 |====
-O2 -ftree-vectorize -ftree-slp-vectorize . 13.70 |====
-O2 -march=znver1 ......................... 13.89 |====
-O2 -flto ................................. 13.64 |====
-O3 ....................................... 13.61 |====
-O3 -march=znver1 ......................... 13.85 |====
-O3 -march=znver1 -flto ................... 14.21 |====
-Ofast -march=znver1 ...................... 13.95 |====

FFTW 3.3.6
Build: Float + SSE - Size: 2D FFT Size 4096
Mflops > Higher Is Better
-O0 ....................................... 2193 |=====
-Og ....................................... 12642 |==========================
-O1 ....................................... 13468 |============================
-O2 ....................................... 13391 |============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 13285 |===========================
-O2 -march=znver1 ......................... 13346 |============================
-O2 -flto ................................. 13214 |===========================
-O3 ....................................... 13555 |============================
-O3 -march=znver1 ......................... 12752 |==========================
-O3 -march=znver1 -flto ................... 13110 |===========================
-Ofast -march=znver1 ...................... 13166 |===========================

Timed PHP Compilation 7.1.9
Time To Compile
Seconds < Lower Is Better
-O0 ....................................... 15.19 |=====
-Og ....................................... 21.42 |========
-O1 ....................................... 29.05 |==========
-O2 ....................................... 52.17 |===================
-O2 -ftree-vectorize -ftree-slp-vectorize . 52.58 |===================
-O2 -march=znver1 ......................... 51.96 |===================
-O3 ....................................... 78.19 |============================
-O3 -march=znver1 ......................... 78.13 |============================

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops > Higher Is Better
-O0 ....................................... 516 |======
-Og ....................................... 2188 |=========================
-O1 ....................................... 2411 |===========================
-O2 ....................................... 2527 |============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 2515 |============================
-O2 -march=znver1 ......................... 2584 |=============================
-O2 -flto ................................. 2299 |==========================
-O3 ....................................... 2475 |============================
-O3 -march=znver1 ......................... 2482 |============================
-O3 -march=znver1 -flto ................... 2052 |=======================
-Ofast -march=znver1 ...................... 2579 |=============================

SciMark 2.0
Computational Test: Composite
Mflops > Higher Is Better
-O0 ....................................... 434 |======
-Og ....................................... 1205 |==================
-O1 ....................................... 1519 |======================
-O2 ....................................... 1369 |====================
-O2 -ftree-vectorize -ftree-slp-vectorize . 1724 |=========================
-O2 -march=znver1 ......................... 1501 |======================
-O2 -flto ................................. 1307 |===================
-O3 ....................................... 1800 |===========================
-O3 -march=znver1 ......................... 1961 |=============================
-O3 -march=znver1 -flto ................... 1747 |==========================
-Ofast -march=znver1 ...................... 1825 |===========================

C-Ray 1.1
Total Time - 4K, 16 Rays Per Pixel
Seconds < Lower Is Better
-O0 ....................................... 44.92 |============================
-Og ....................................... 28.64 |==================
-O1 ....................................... 28.74 |==================
-O2 ....................................... 25.84 |================
-O2 -ftree-vectorize -ftree-slp-vectorize . 25.77 |================
-O2 -march=znver1 ......................... 21.58 |=============
-O2 -flto ................................. 25.96 |================
-O3 ....................................... 12.60 |========
-O3 -march=znver1 ......................... 11.35 |=======
-O3 -march=znver1 -flto ................... 11.31 |=======
-Ofast -march=znver1 ...................... 10.40 |======

LAME MP3 Encoding 3.100
WAV To MP3
Seconds < Lower Is Better
-O0 ....................................... 41.79 |============================
-Og ....................................... 16.78 |===========
-O1 ....................................... 14.32 |==========
-O2 ....................................... 14.07 |=========
-O2 -ftree-vectorize -ftree-slp-vectorize . 10.96 |=======
-O2 -march=znver1 ......................... 14.00 |=========
-O2 -flto ................................. 14.14 |=========
-O3 ....................................... 10.84 |=======
-O3 -march=znver1 ......................... 10.57 |=======
-O3 -march=znver1 -flto ................... 10.38 |=======
-Ofast -march=znver1 ...................... 9.80 |=======

FFTW 3.3.6
Build: Stock - Size: 2D FFT Size 4096
Mflops > Higher Is Better
-O0 ....................................... 1708 |=========
-Og ....................................... 4366 |=======================
-O1 ....................................... 4632 |========================
-O2 ....................................... 4625 |========================
-O2 -ftree-vectorize -ftree-slp-vectorize . 4805 |=========================
-O2 -march=znver1 ......................... 5074 |==========================
-O2 -flto ................................. 5091 |===========================
-O3 ....................................... 4751 |=========================
-O3 -march=znver1 ......................... 5006 |==========================
-O3 -march=znver1 -flto ................... 5571 |=============================
-Ofast -march=znver1 ...................... 4885 |=========================

Timed ImageMagick Compilation 6.9.0
Time To Compile
Seconds < Lower Is Better
-O0 ....................................... 5.23 |=
-Og ....................................... 7.89 |==
-O1 ....................................... 18.42 |====
-O2 ....................................... 23.63 |=====
-O2 -ftree-vectorize -ftree-slp-vectorize . 23.91 |=====
-O2 -march=znver1 ......................... 23.78 |=====
-O2 -flto ................................. 98.67 |======================
-O3 ....................................... 25.06 |======
-O3 -march=znver1 ......................... 24.88 |======
-O3 -march=znver1 -flto ................... 118.48 |===========================
-Ofast -march=znver1 ...................... 25.21 |======

Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS > Higher Is Better
-O0 ....................................... 383 |===========
-Og ....................................... 772 |======================
-O1 ....................................... 785 |======================
-O2 ....................................... 1017 |=============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 1007 |=============================
-O2 -march=znver1 ......................... 1001 |============================
-O2 -flto ................................. 1022 |=============================
-O3 ....................................... 1008 |=============================
-O3 -march=znver1 ......................... 1011 |=============================
-O3 -march=znver1 -flto ................... 1000 |============================
-Ofast -march=znver1 ...................... 1022 |=============================

Timed Apache Compilation 2.4.7
Time To Compile
Seconds < Lower Is Better
-O0 ....................................... 11.43 |===========
-Og ....................................... 14.59 |==============
-O1 ....................................... 17.51 |=================
-O2 ....................................... 23.82 |=======================
-O2 -ftree-vectorize -ftree-slp-vectorize . 24.03 |========================
-O2 -march=znver1 ......................... 23.82 |=======================
-O2 -flto ................................. 26.50 |==========================
-O3 ....................................... 26.08 |==========================
-O3 -march=znver1 ......................... 25.94 |=========================
-O3 -march=znver1 -flto ................... 28.62 |============================
-Ofast -march=znver1 ...................... 26.11 |==========================

GraphicsMagick 1.3.30
Operation: Sharpen
Iterations Per Minute > Higher Is Better
-O0 ....................................... 82 |=============
-Og ....................................... 156 |==========================
-O1 ....................................... 180 |==============================
-O2 ....................................... 181 |==============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 180 |==============================
-O2 -march=znver1 ......................... 183 |==============================
-O2 -flto ................................. 183 |==============================
-O3 ....................................... 174 |=============================
-O3 -march=znver1 ......................... 183 |==============================
-O3 -march=znver1 -flto ................... 183 |==============================
-Ofast -march=znver1 ...................... 182 |==============================

GraphicsMagick 1.3.30
Operation: Enhanced
Iterations Per Minute > Higher Is Better
-O0 ....................................... 90 |==============
-Og ....................................... 173 |===========================
-O1 ....................................... 187 |=============================
-O2 ....................................... 189 |=============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 188 |=============================
-O2 -march=znver1 ......................... 191 |==============================
-O2 -flto ................................. 190 |==============================
-O3 ....................................... 181 |============================
-O3 -march=znver1 ......................... 191 |==============================
-O3 -march=znver1 -flto ................... 186 |=============================
-Ofast -march=znver1 ...................... 193 |==============================

GraphicsMagick 1.3.30
Operation: HWB Color Space
Iterations Per Minute > Higher Is Better
-O0 ....................................... 102 |==============
-Og ....................................... 195 |===========================
-O1 ....................................... 210 |=============================
-O2 ....................................... 211 |==============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 212 |==============================
-O2 -march=znver1 ......................... 211 |==============================
-O2 -flto ................................. 214 |==============================
-O3 ....................................... 203 |============================
-O3 -march=znver1 ......................... 210 |=============================
-O3 -march=znver1 -flto ................... 209 |=============================
-Ofast -march=znver1 ...................... 209 |=============================

GraphicsMagick 1.3.30
Operation: Swirl
Iterations Per Minute > Higher Is Better
-O0 ....................................... 96 |===============
-Og ....................................... 181 |============================
-O1 ....................................... 194 |==============================
-O2 ....................................... 195 |==============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 196 |==============================
-O2 -march=znver1 ......................... 196 |==============================
-O2 -flto ................................. 196 |==============================
-O3 ....................................... 189 |=============================
-O3 -march=znver1 ......................... 195 |==============================
-O3 -march=znver1 -flto ................... 194 |==============================
-Ofast -march=znver1 ...................... 196 |==============================

GraphicsMagick 1.3.30
Operation: Noise-Gaussian
Iterations Per Minute > Higher Is Better
-O0 ....................................... 92 |===============
-Og ....................................... 168 |===========================
-O1 ....................................... 179 |=============================
-O2 ....................................... 180 |=============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 178 |=============================
-O2 -march=znver1 ......................... 180 |=============================
-O2 -flto ................................. 180 |=============================
-O3 ....................................... 172 |============================
-O3 -march=znver1 ......................... 180 |=============================
-O3 -march=znver1 -flto ................... 178 |=============================
-Ofast -march=znver1 ...................... 187 |==============================

SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops > Higher Is Better
-O0 ....................................... 832 |==============
-Og ....................................... 919 |================
-O1 ....................................... 919 |================
-O2 ....................................... 919 |================
-O2 -ftree-vectorize -ftree-slp-vectorize . 919 |================
-O2 -march=znver1 ......................... 1016 |=================
-O2 -flto ................................. 918 |================
-O3 ....................................... 1427 |=========================
-O3 -march=znver1 ......................... 1689 |=============================
-O3 -march=znver1 -flto ................... 1675 |=============================
-Ofast -march=znver1 ...................... 1676 |=============================

SciMark 2.0
Computational Test: Monte Carlo
Mflops > Higher Is Better
-O0 ....................................... 108 |==
-Og ....................................... 210 |====
-O1 ....................................... 576 |===========
-O2 ....................................... 560 |===========
-O2 -ftree-vectorize -ftree-slp-vectorize . 560 |===========
-O2 -march=znver1 ......................... 557 |===========
-O2 -flto ................................. 568 |===========
-O3 ....................................... 560 |===========
-O3 -march=znver1 ......................... 557 |===========
-O3 -march=znver1 -flto ................... 1480 |=============================
-Ofast -march=znver1 ...................... 561 |===========

GraphicsMagick 1.3.30
Operation: Rotate
Iterations Per Minute > Higher Is Better
-O0 ....................................... 98 |===============
-Og ....................................... 181 |============================
-O1 ....................................... 191 |==============================
-O2 ....................................... 191 |==============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 190 |==============================
-O2 -march=znver1 ......................... 191 |==============================
-O2 -flto ................................. 191 |==============================
-O3 ....................................... 183 |=============================
-O3 -march=znver1 ......................... 190 |==============================
-O3 -march=znver1 -flto ................... 188 |==============================
-Ofast -march=znver1 ...................... 189 |==============================

AOBench
Size: 2048 x 2048 - Total Time
Seconds < Lower Is Better
-O0 ....................................... 92.50 |============================
-Og ....................................... 77.41 |=======================
-O1 ....................................... 56.61 |=================
-O2 ....................................... 55.54 |=================
-O2 -ftree-vectorize -ftree-slp-vectorize . 55.53 |=================
-O2 -march=znver1 ......................... 54.35 |================
-O2 -flto ................................. 55.52 |=================
-O3 ....................................... 53.53 |================
-O3 -march=znver1 ......................... 51.49 |================
-O3 -march=znver1 -flto ................... 52.08 |================
-Ofast -march=znver1 ...................... 51.73 |================

GraphicsMagick 1.3.30
Operation: Resizing
Iterations Per Minute > Higher Is Better
-O0 ....................................... 74 |=================
-Og ....................................... 120 |===========================
-O1 ....................................... 126 |=============================
-O2 ....................................... 131 |==============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 128 |=============================
-O2 -march=znver1 ......................... 127 |=============================
-O2 -flto ................................. 128 |=============================
-O3 ....................................... 118 |===========================
-O3 -march=znver1 ......................... 127 |=============================
-O3 -march=znver1 -flto ................... 125 |=============================
-Ofast -march=znver1 ...................... 124 |============================

PostgreSQL pgbench 10.3
Scaling: Buffer Test - Test: Single Thread - Mode: Read Only
TPS > Higher Is Better
-O0 ....................................... 9063 |================
-Og ....................................... 13333 |=======================
-O1 ....................................... 13303 |=======================
-O2 ....................................... 14931 |==========================
-O2 -ftree-vectorize -ftree-slp-vectorize . 15353 |===========================
-O2 -march=znver1 ......................... 15111 |==========================
-O2 -flto ................................. 14851 |==========================
-O3 ....................................... 15099 |==========================
-O3 -march=znver1 ......................... 15188 |===========================
-O3 -march=znver1 -flto ................... 16012 |============================
-Ofast -march=znver1 ...................... 15352 |===========================

Timed HMMer Search 2.3.2
Pfam Database Search
Seconds < Lower Is Better
-O0 ....................................... 9.02 |=============================
-Og ....................................... 7.39 |========================
-O1 ....................................... 6.93 |======================
-O2 ....................................... 6.62 |=====================
-O2 -ftree-vectorize -ftree-slp-vectorize . 6.82 |======================
-O2 -march=znver1 ......................... 6.54 |=====================
-O2 -flto ................................. 6.56 |=====================
-O3 ....................................... 6.57 |=====================
-O3 -march=znver1 ......................... 6.29 |====================
-O3 -march=znver1 -flto ................... 6.16 |====================
-Ofast -march=znver1 ...................... 6.00 |===================

x264 2018-09-25
H.264 Video Encoding
Frames Per Second > Higher Is Better
-O0 ....................................... 102 |=====================
-Og ....................................... 142 |=============================
-O1 ....................................... 145 |==============================
-O2 ....................................... 144 |=============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 144 |=============================
-O2 -march=znver1 ......................... 144 |=============================
-O3 ....................................... 147 |==============================
-O3 -march=znver1 ......................... 144 |=============================
-Ofast -march=znver1 ...................... 144 |=============================

PostgreSQL pgbench 10.3
Scaling: Buffer Test - Test: Normal Load - Mode: Read Write
TPS > Higher Is Better
-O0 ....................................... 4585 |==========================
-Og ....................................... 3767 |======================
-O1 ....................................... 4301 |=========================
-O2 ....................................... 4167 |========================
-O2 -ftree-vectorize -ftree-slp-vectorize . 4239 |========================
-O2 -march=znver1 ......................... 4272 |========================
-O2 -flto ................................. 4095 |=======================
-O3 ....................................... 4262 |========================
-O3 -march=znver1 ......................... 5068 |=============================
-O3 -march=znver1 -flto ................... 4319 |=========================
-Ofast -march=znver1 ...................... 4102 |=======================

libjpeg-turbo tjbench 1.5.3
Test: Decompression Throughput
Megapixels/sec > Higher Is Better
-O0 ....................................... 111 |=======================
-Og ....................................... 141 |=============================
-O1 ....................................... 139 |=============================
-O2 ....................................... 140 |=============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 139 |=============================
-O2 -march=znver1 ......................... 142 |==============================
-O2 -flto ................................. 140 |=============================
-O3 ....................................... 141 |=============================
-O3 -march=znver1 ......................... 144 |==============================
-O3 -march=znver1 -flto ................... 144 |==============================
-Ofast -march=znver1 ...................... 144 |==============================

PostgreSQL pgbench 10.3
Scaling: Buffer Test - Test: Single Thread - Mode: Read Write
TPS > Higher Is Better
-O0 ....................................... 886 |======================
-Og ....................................... 1080 |===========================
-O1 ....................................... 1065 |===========================
-O2 ....................................... 1037 |==========================
-O2 -ftree-vectorize -ftree-slp-vectorize . 1125 |============================
-O2 -march=znver1 ......................... 1060 |===========================
-O2 -flto ................................. 1127 |=============================
-O3 ....................................... 1079 |===========================
-O3 -march=znver1 ......................... 1145 |=============================
-O3 -march=znver1 -flto ................... 1074 |===========================
-Ofast -march=znver1 ...................... 1125 |============================

SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops > Higher Is Better
-O0 ....................................... 201 |=======================
-Og ....................................... 257 |==============================
-O1 ....................................... 226 |==========================
-O2 ....................................... 230 |===========================
-O2 -ftree-vectorize -ftree-slp-vectorize . 231 |===========================
-O2 -march=znver1 ......................... 229 |===========================
-O2 -flto ................................. 232 |===========================
-O3 ....................................... 232 |===========================
-O3 -march=znver1 ......................... 227 |==========================
-O3 -march=znver1 -flto ................... 230 |===========================
-Ofast -march=znver1 ...................... 221 |==========================

PostgreSQL pgbench 10.3
Scaling: Buffer Test - Test: Normal Load - Mode: Read Only
TPS > Higher Is Better
-O0 ....................................... 419700 |=====================
-Og ....................................... 507203 |==========================
-O1 ....................................... 515102 |==========================
-O2 ....................................... 515340 |==========================
-O2 -ftree-vectorize -ftree-slp-vectorize . 529699 |===========================
-O2 -march=znver1 ......................... 510425 |==========================
-O2 -flto ................................. 520570 |===========================
-O3 ....................................... 490551 |=========================
-O3 -march=znver1 ......................... 505031 |==========================
-O3 -march=znver1 -flto ................... 454256 |=======================
-Ofast -march=znver1 ...................... 508384 |==========================

John The Ripper 1.8.0-jumbo-1
Test: Traditional DES
Real C/S > Higher Is Better
-O0 ....................................... 218232000 |====================
-Og ....................................... 239289333 |======================
-O1 ....................................... 257067200 |========================
-O2 ....................................... 257407667 |========================
-O2 -ftree-vectorize -ftree-slp-vectorize . 257058000 |========================
-O2 -march=znver1 ......................... 255957000 |========================
-O2 -flto ................................. 260736667 |========================
-O3 ....................................... 253868583 |=======================
-O3 -march=znver1 ......................... 260019667 |========================
-O3 -march=znver1 -flto ................... 254777333 |=======================
-Ofast -march=znver1 ...................... 258770667 |========================

Bullet Physics Engine 2.81
Test: 1000 Stack
Seconds < Lower Is Better
-O0 ....................................... 6.00 |============================
-Og ....................................... 6.02 |============================
-O1 ....................................... 6.00 |============================
-O2 ....................................... 6.01 |============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 5.98 |===========================
-O2 -march=znver1 ......................... 5.80 |===========================
-O2 -flto ................................. 6.32 |=============================
-O3 ....................................... 6.05 |============================
-O3 -march=znver1 ......................... 5.80 |===========================
-Ofast -march=znver1 ...................... 5.80 |===========================

Hierarchical INTegration 1.0
Test: DOUBLE
QUIPs > Higher Is Better
-O0 ....................................... 598545342 |=======================
-Og ....................................... 597234266 |=======================
-O1 ....................................... 585060029 |======================
-O2 ....................................... 599481605 |=======================
-O2 -ftree-vectorize -ftree-slp-vectorize . 602535297 |=======================
-O2 -march=znver1 ......................... 617516626 |========================
-O2 -flto ................................. 626640400 |========================
-O3 ....................................... 595428047 |=======================
-O3 -march=znver1 ......................... 589289926 |=======================
-O3 -march=znver1 -flto ................... 618644101 |========================
-Ofast -march=znver1 ...................... 605331833 |=======================

Bullet Physics Engine 2.81
Test: 136 Ragdolls
Seconds < Lower Is Better
-O0 ....................................... 3.14 |============================
-Og ....................................... 3.14 |============================
-O1 ....................................... 3.15 |============================
-O2 ....................................... 3.14 |============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 3.15 |============================
-O2 -march=znver1 ......................... 3.05 |===========================
-O2 -flto ................................. 
3.24 |============================= -O3 ....................................... 3.15 |============================ -O3 -march=znver1 ......................... 3.06 |=========================== -Ofast -march=znver1 ...................... 3.06 |=========================== SVT-VP9 2019-02-17 1080p 8-bit YUV To VP9 Video Encode Frames Per Second > Higher Is Better -Og ....................................... 92.68 |=========================== -O2 -ftree-vectorize -ftree-slp-vectorize . 94.82 |=========================== -O2 -march=znver1 ......................... 95.91 |=========================== -O2 -flto ................................. 95.79 |=========================== -O3 -march=znver1 -flto ................... 97.26 |============================ -Ofast -march=znver1 ...................... 97.80 |============================ Bullet Physics Engine 2.81 Test: 1000 Convex Seconds < Lower Is Better -O0 ....................................... 5.36 |============================= -Og ....................................... 5.37 |============================= -O1 ....................................... 5.37 |============================= -O2 ....................................... 5.37 |============================= -O2 -ftree-vectorize -ftree-slp-vectorize . 5.37 |============================= -O2 -march=znver1 ......................... 5.19 |============================ -O2 -flto ................................. 5.40 |============================= -O3 ....................................... 5.38 |============================= -O3 -march=znver1 ......................... 5.18 |============================ -Ofast -march=znver1 ...................... 5.18 |============================ VP9 libvpx Encoding 1.8.0 vpxenc VP9 1080p Video Encode Frames Per Second > Higher Is Better -Og ....................................... 20.39 |=========================== -O2 -ftree-vectorize -ftree-slp-vectorize . 
20.34 |=========================== -O2 -march=znver1 ......................... 20.05 |=========================== -O3 -march=znver1 -flto ................... 20.86 |============================ -Ofast -march=znver1 ...................... 20.13 |=========================== SVT-AV1 2019-02-03 1080p 8-bit YUV To AV1 Video Encode Frames Per Second > Higher Is Better -O0 ....................................... 1.69 |============================ -Og ....................................... 1.73 |============================= -O1 ....................................... 1.70 |============================ -O2 ....................................... 1.69 |============================ -O2 -ftree-vectorize -ftree-slp-vectorize . 1.67 |============================ -O2 -march=znver1 ......................... 1.70 |============================ -O2 -flto ................................. 1.69 |============================ -O3 ....................................... 1.73 |============================= -O3 -march=znver1 ......................... 1.68 |============================ -O3 -march=znver1 -flto ................... 1.71 |============================= -Ofast -march=znver1 ...................... 1.70 |============================ VP9 libvpx Encoding 1.8.0 vpxenc VP9 1080p Video Encode Frames Per Second > Higher Is Better -O0 ....................................... 12.50 |=========================== -Og ....................................... 12.52 |=========================== -O1 ....................................... 12.53 |============================ -O2 ....................................... 12.54 |============================ -O2 -ftree-vectorize -ftree-slp-vectorize . 12.56 |============================ -O2 -march=znver1 ......................... 12.42 |=========================== -O3 ....................................... 12.31 |=========================== -O3 -march=znver1 ......................... 
12.41 |=========================== -O3 -march=znver1 -flto ................... 12.75 |============================ -Ofast -march=znver1 ...................... 12.37 |=========================== x265 3.0 H.265 1080p Video Encoding Frames Per Second > Higher Is Better -O0 ....................................... 35.00 |============================ -Og ....................................... 34.76 |=========================== -O1 ....................................... 35.62 |============================ -O2 ....................................... 34.55 |=========================== -O2 -ftree-vectorize -ftree-slp-vectorize . 35.41 |============================ -O2 -march=znver1 ......................... 34.80 |=========================== -O2 -flto ................................. 35.07 |============================ -O3 ....................................... 35.21 |============================ -O3 -march=znver1 ......................... 35.57 |============================ -Ofast -march=znver1 ...................... 34.91 |=========================== Bullet Physics Engine 2.81 Test: 3000 Fall Seconds < Lower Is Better -O0 ....................................... 5.14 |============================= -Og ....................................... 5.15 |============================= -O1 ....................................... 5.16 |============================= -O2 ....................................... 5.14 |============================= -O2 -ftree-vectorize -ftree-slp-vectorize . 5.14 |============================= -O2 -march=znver1 ......................... 5.08 |============================ -O2 -flto ................................. 5.21 |============================= -O3 ....................................... 5.16 |============================= -O3 -march=znver1 ......................... 5.07 |============================ -Ofast -march=znver1 ...................... 
5.09 |============================ Bullet Physics Engine 2.81 Test: Raytests Seconds < Lower Is Better -O0 ....................................... 3.11 |============================= -Og ....................................... 3.11 |============================= -O1 ....................................... 3.12 |============================= -O2 ....................................... 3.12 |============================= -O2 -ftree-vectorize -ftree-slp-vectorize . 3.11 |============================= -O2 -march=znver1 ......................... 3.10 |============================= -O2 -flto ................................. 3.05 |============================ -O3 ....................................... 3.11 |============================= -O3 -march=znver1 ......................... 3.09 |============================= -Ofast -march=znver1 ...................... 3.09 |============================= Stockfish 9 Total Time Nodes Per Second > Higher Is Better -O0 ....................................... 105868175 |======================== -Og ....................................... 105709690 |======================== -O1 ....................................... 105698092 |======================== -O2 ....................................... 104480422 |======================== -O2 -ftree-vectorize -ftree-slp-vectorize . 104197865 |======================= -O2 -march=znver1 ......................... 106084276 |======================== -O2 -flto ................................. 104536605 |======================== -O3 ....................................... 104121840 |======================= -O3 -march=znver1 ......................... 106497994 |======================== -Ofast -march=znver1 ...................... 106507244 |======================== Bullet Physics Engine 2.81 Test: Convex Trimesh Seconds < Lower Is Better -O0 ....................................... 1.35 |============================= -Og ....................................... 
1.35 |============================= -O1 ....................................... 1.35 |============================= -O2 ....................................... 1.35 |============================= -O2 -ftree-vectorize -ftree-slp-vectorize . 1.35 |============================= -O2 -march=znver1 ......................... 1.32 |============================ -O2 -flto ................................. 1.33 |============================= -O3 ....................................... 1.35 |============================= -O3 -march=znver1 ......................... 1.32 |============================ -Ofast -march=znver1 ...................... 1.32 |============================ Bullet Physics Engine 2.81 Test: Prim Trimesh Seconds < Lower Is Better -O0 ....................................... 1.11 |============================= -Og ....................................... 1.11 |============================= -O1 ....................................... 1.11 |============================= -O2 ....................................... 1.11 |============================= -O2 -ftree-vectorize -ftree-slp-vectorize . 1.11 |============================= -O2 -march=znver1 ......................... 1.11 |============================= -O2 -flto ................................. 1.09 |============================ -O3 ....................................... 1.11 |============================= -O3 -march=znver1 ......................... 1.11 |============================= -Ofast -march=znver1 ...................... 1.11 |============================= SVT-AV1 2019-02-15 1080p 8-bit YUV To AV1 Video Encode Frames Per Second > Higher Is Better -O0 ....................................... 5.88 |============================= -Og ....................................... 5.86 |============================= -O1 ....................................... 5.87 |============================= -O2 ....................................... 5.81 |============================= -O2 -ftree-vectorize -ftree-slp-vectorize . 
5.89 |============================= -O2 -march=znver1 ......................... 5.84 |============================= -O2 -flto ................................. 5.90 |============================= -O3 ....................................... 5.90 |============================= -O3 -march=znver1 ......................... 5.89 |============================= -O3 -march=znver1 -flto ................... 5.84 |============================= -Ofast -march=znver1 ...................... 5.91 |============================= Hierarchical INTegration 1.0 Test: FLOAT QUIPs > Higher Is Better -O0 ....................................... 267404445 |======================== -Og ....................................... 267368671 |======================== -O1 ....................................... 268455578 |======================== -O2 ....................................... 267311970 |======================== -O2 -ftree-vectorize -ftree-slp-vectorize . 267172145 |======================== -O2 -march=znver1 ......................... 267268023 |======================== -O2 -flto ................................. 268173400 |======================== -O3 ....................................... 267315647 |======================== -O3 -march=znver1 ......................... 268506472 |======================== -O3 -march=znver1 -flto ................... 267239405 |======================== -Ofast -march=znver1 ...................... 267055407 |======================== TSCP 1.81 AI Chess Performance Nodes Per Second > Higher Is Better -O0 ....................................... 865459 |=========================== -Og ....................................... 865187 |=========================== -O1 ....................................... 864102 |=========================== -O2 ....................................... 864373 |=========================== -O2 -ftree-vectorize -ftree-slp-vectorize . 864916 |=========================== -O2 -march=znver1 ......................... 
864915 |=========================== -O2 -flto ................................. 864101 |=========================== -O3 ....................................... 864915 |=========================== -O3 -march=znver1 ......................... 865732 |=========================== -O3 -march=znver1 -flto ................... 863018 |=========================== -Ofast -march=znver1 ...................... 864373 |=========================== ctx_clock Context Switch Time Clocks < Lower Is Better -Og ....................................... 132 |============================== -O2 -ftree-vectorize -ftree-slp-vectorize . 132 |============================== -O2 -flto ................................. 132 |============================== -O3 -march=znver1 -flto ................... 132 |============================== Zstd Compression 1.3.4 Compressing ubuntu-16.04.3-server-i386.img, Compression Level 19 Seconds < Lower Is Better -O0 ....................................... 23.12 |============================ -Og ....................................... 14.39 |================= -O1 ....................................... 14.11 |================= -O2 ....................................... 14.48 |================== -O2 -ftree-vectorize -ftree-slp-vectorize . 13.67 |================= -O2 -march=znver1 ......................... 14.71 |================== -O2 -flto ................................. 14.08 |================= -O3 ....................................... 13.66 |================= -O3 -march=znver1 ......................... 14.37 |================= -O3 -march=znver1 -flto ................... 13.16 |================ -Ofast -march=znver1 ...................... 13.77 |================= John The Ripper 1.8.0-jumbo-1 Test: Blowfish Real C/S > Higher Is Better -O0 ....................................... 15179 |====== -Og ....................................... 56453 |======================== -O1 ....................................... 
65995 |============================ -O2 ....................................... 62718 |========================== -O2 -ftree-vectorize -ftree-slp-vectorize . 63586 |=========================== -O2 -march=znver1 ......................... 61309 |========================== -O2 -flto ................................. 65117 |=========================== -O3 ....................................... 65806 |============================ -O3 -march=znver1 ......................... 66823 |============================ -O3 -march=znver1 -flto ................... 58764 |========================= -Ofast -march=znver1 ...................... 62841 |========================== SciMark 2.0 Computational Test: Dense LU Matrix Factorization Mflops > Higher Is Better -O0 ....................................... 512 |=== -Og ....................................... 2539 |=============== -O1 ....................................... 3466 |===================== -O2 ....................................... 2609 |================ -O2 -ftree-vectorize -ftree-slp-vectorize . 4396 |========================== -O2 -march=znver1 ......................... 3231 |=================== -O2 -flto ................................. 2515 |=============== -O3 ....................................... 4307 |========================== -O3 -march=znver1 ......................... 4851 |============================= -O3 -march=znver1 -flto ................... 3300 |==================== -Ofast -march=znver1 ...................... 4089 |========================