GCC 9 compiler tuning benchmarks by Michael Larabel for a future article on Phoronix.com.
Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 1902194-SP-AMDEPYCCO19
AMD EPYC Compiler Tuning
-O0, -Og, -O1, -O2, -O2 -ftree-vectorize -ftree-slp-vectorize, -O2 -march=znver1, -O2 -flto, -O3, -O3 -march=znver1, -O3 -march=znver1 -flto, -Ofast -march=znver1 (identical system for all configurations):
Processor: 2 x AMD EPYC 7601 32-Core (64 Cores / 128 Threads), Motherboard: Dell 02MJ3T (1.2.5 BIOS), Chipset: AMD Family 17h, Memory: 16 x 32 GB DDR4-2400MT/s 36ASF4G72PZ-2G6D2, Disk: 120GB SSDSCKJB120G7R + 20 x 500GB Samsung SSD 860, Graphics: Matrox G200eW3, Monitor: VE228, Network: 2 x Broadcom BCM57416 NetXtreme-E 10GBase-T RDMA + 2 x Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 18.04, Kernel: 5.0.0-050000rc6-generic (x86_64) 20190210, Desktop: GNOME Shell 3.28.3, Display Server: X Server, Compiler: GCC 9.0.1 20190210, File-System: ext4, Screen Resolution: 1600x1200
FLAC Audio Encoding 1.3.2
WAV To FLAC
Seconds < Lower Is Better
-O0 ....................................... 96.77 |============================
-Og ....................................... 15.58 |=====
-O1 ....................................... 15.01 |====
-O2 ....................................... 13.65 |====
-O2 -ftree-vectorize -ftree-slp-vectorize . 13.70 |====
-O2 -march=znver1 ......................... 13.89 |====
-O2 -flto ................................. 13.64 |====
-O3 ....................................... 13.61 |====
-O3 -march=znver1 ......................... 13.85 |====
-O3 -march=znver1 -flto ................... 14.21 |====
-Ofast -march=znver1 ...................... 13.95 |====
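The bars in each table are scaled per test, so comparing flag sets across tests is easier with ratios against a baseline. The following is a minimal sketch, not part of the Phoronix Test Suite, that parses rows in the layout used in this file and computes each configuration's speedup relative to -O0. The names `ROW`, `parse_rows`, and `speedup_vs` are hypothetical helpers introduced here for illustration, and the sample rows are copied from the FLAC table above.

```python
import re

# One result row looks like: "<flags> <dot leader> <value> |<bar>".
# The flags part is matched lazily so multi-flag labels stay intact.
ROW = re.compile(r"^(?P<flags>-\S.*?)\s+\.+\s+(?P<value>\d+(?:\.\d+)?)\s+\|=*$")


def parse_rows(text):
    """Return {flags: value} for every result row found in text."""
    rows = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows[match.group("flags")] = float(match.group("value"))
    return rows


def speedup_vs(rows, baseline, lower_is_better):
    """Return {flags: factor}, where factor > 1 means better than baseline."""
    base = rows[baseline]
    if lower_is_better:
        # Times: a smaller value is better, so divide baseline by value.
        return {flags: base / value for flags, value in rows.items()}
    # Throughput: a larger value is better, so divide value by baseline.
    return {flags: value / base for flags, value in rows.items()}


# Rows copied from the FLAC Audio Encoding table (Seconds, lower is better).
SAMPLE = """\
-O0 ....................................... 96.77 |============================
-O2 ....................................... 13.65 |====
-O2 -ftree-vectorize -ftree-slp-vectorize . 13.70 |====
-O3 ....................................... 13.61 |====
"""

if __name__ == "__main__":
    rows = parse_rows(SAMPLE)
    for flags, factor in speedup_vs(rows, "-O0", lower_is_better=True).items():
        print(f"{flags}: {factor:.2f}x")
```

Pass `lower_is_better=False` for tables marked "Higher Is Better" (Mflops, TPS, Frames Per Second), so that a factor above 1 always means an improvement over the baseline.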
FFTW 3.3.6
Build: Float + SSE - Size: 2D FFT Size 4096
Mflops > Higher Is Better
-O0 ....................................... 2193 |=====
-Og ....................................... 12642 |==========================
-O1 ....................................... 13468 |============================
-O2 ....................................... 13391 |============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 13285 |===========================
-O2 -march=znver1 ......................... 13346 |============================
-O2 -flto ................................. 13214 |===========================
-O3 ....................................... 13555 |============================
-O3 -march=znver1 ......................... 12752 |==========================
-O3 -march=znver1 -flto ................... 13110 |===========================
-Ofast -march=znver1 ...................... 13166 |===========================
Timed PHP Compilation 7.1.9
Time To Compile
Seconds < Lower Is Better
-O0 ....................................... 15.19 |=====
-Og ....................................... 21.42 |========
-O1 ....................................... 29.05 |==========
-O2 ....................................... 52.17 |===================
-O2 -ftree-vectorize -ftree-slp-vectorize . 52.58 |===================
-O2 -march=znver1 ......................... 51.96 |===================
-O3 ....................................... 78.19 |============================
-O3 -march=znver1 ......................... 78.13 |============================
SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops > Higher Is Better
-O0 ....................................... 516 |======
-Og ....................................... 2188 |=========================
-O1 ....................................... 2411 |===========================
-O2 ....................................... 2527 |============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 2515 |============================
-O2 -march=znver1 ......................... 2584 |=============================
-O2 -flto ................................. 2299 |==========================
-O3 ....................................... 2475 |============================
-O3 -march=znver1 ......................... 2482 |============================
-O3 -march=znver1 -flto ................... 2052 |=======================
-Ofast -march=znver1 ...................... 2579 |=============================
SciMark 2.0
Computational Test: Composite
Mflops > Higher Is Better
-O0 ....................................... 434 |======
-Og ....................................... 1205 |==================
-O1 ....................................... 1519 |======================
-O2 ....................................... 1369 |====================
-O2 -ftree-vectorize -ftree-slp-vectorize . 1724 |=========================
-O2 -march=znver1 ......................... 1501 |======================
-O2 -flto ................................. 1307 |===================
-O3 ....................................... 1800 |===========================
-O3 -march=znver1 ......................... 1961 |=============================
-O3 -march=znver1 -flto ................... 1747 |==========================
-Ofast -march=znver1 ...................... 1825 |===========================
C-Ray 1.1
Total Time - 4K, 16 Rays Per Pixel
Seconds < Lower Is Better
-O0 ....................................... 44.92 |============================
-Og ....................................... 28.64 |==================
-O1 ....................................... 28.74 |==================
-O2 ....................................... 25.84 |================
-O2 -ftree-vectorize -ftree-slp-vectorize . 25.77 |================
-O2 -march=znver1 ......................... 21.58 |=============
-O2 -flto ................................. 25.96 |================
-O3 ....................................... 12.60 |========
-O3 -march=znver1 ......................... 11.35 |=======
-O3 -march=znver1 -flto ................... 11.31 |=======
-Ofast -march=znver1 ...................... 10.40 |======
LAME MP3 Encoding 3.100
WAV To MP3
Seconds < Lower Is Better
-O0 ....................................... 41.79 |============================
-Og ....................................... 16.78 |===========
-O1 ....................................... 14.32 |==========
-O2 ....................................... 14.07 |=========
-O2 -ftree-vectorize -ftree-slp-vectorize . 10.96 |=======
-O2 -march=znver1 ......................... 14.00 |=========
-O2 -flto ................................. 14.14 |=========
-O3 ....................................... 10.84 |=======
-O3 -march=znver1 ......................... 10.57 |=======
-O3 -march=znver1 -flto ................... 10.38 |=======
-Ofast -march=znver1 ...................... 9.80 |=======
FFTW 3.3.6
Build: Stock - Size: 2D FFT Size 4096
Mflops > Higher Is Better
-O0 ....................................... 1708 |=========
-Og ....................................... 4366 |=======================
-O1 ....................................... 4632 |========================
-O2 ....................................... 4625 |========================
-O2 -ftree-vectorize -ftree-slp-vectorize . 4805 |=========================
-O2 -march=znver1 ......................... 5074 |==========================
-O2 -flto ................................. 5091 |===========================
-O3 ....................................... 4751 |=========================
-O3 -march=znver1 ......................... 5006 |==========================
-O3 -march=znver1 -flto ................... 5571 |=============================
-Ofast -march=znver1 ...................... 4885 |=========================
Timed ImageMagick Compilation 6.9.0
Time To Compile
Seconds < Lower Is Better
-O0 ....................................... 5.23 |=
-Og ....................................... 7.89 |==
-O1 ....................................... 18.42 |====
-O2 ....................................... 23.63 |=====
-O2 -ftree-vectorize -ftree-slp-vectorize . 23.91 |=====
-O2 -march=znver1 ......................... 23.78 |=====
-O2 -flto ................................. 98.67 |======================
-O3 ....................................... 25.06 |======
-O3 -march=znver1 ......................... 24.88 |======
-O3 -march=znver1 -flto ................... 118.48 |===========================
-Ofast -march=znver1 ...................... 25.21 |======
Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS > Higher Is Better
-O0 ....................................... 383 |===========
-Og ....................................... 772 |======================
-O1 ....................................... 785 |======================
-O2 ....................................... 1017 |=============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 1007 |=============================
-O2 -march=znver1 ......................... 1001 |============================
-O2 -flto ................................. 1022 |=============================
-O3 ....................................... 1008 |=============================
-O3 -march=znver1 ......................... 1011 |=============================
-O3 -march=znver1 -flto ................... 1000 |============================
-Ofast -march=znver1 ...................... 1022 |=============================
Timed Apache Compilation 2.4.7
Time To Compile
Seconds < Lower Is Better
-O0 ....................................... 11.43 |===========
-Og ....................................... 14.59 |==============
-O1 ....................................... 17.51 |=================
-O2 ....................................... 23.82 |=======================
-O2 -ftree-vectorize -ftree-slp-vectorize . 24.03 |========================
-O2 -march=znver1 ......................... 23.82 |=======================
-O2 -flto ................................. 26.50 |==========================
-O3 ....................................... 26.08 |==========================
-O3 -march=znver1 ......................... 25.94 |=========================
-O3 -march=znver1 -flto ................... 28.62 |============================
-Ofast -march=znver1 ...................... 26.11 |==========================
GraphicsMagick 1.3.30
Operation: Sharpen
Iterations Per Minute > Higher Is Better
-O0 ....................................... 82 |=============
-Og ....................................... 156 |==========================
-O1 ....................................... 180 |==============================
-O2 ....................................... 181 |==============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 180 |==============================
-O2 -march=znver1 ......................... 183 |==============================
-O2 -flto ................................. 183 |==============================
-O3 ....................................... 174 |=============================
-O3 -march=znver1 ......................... 183 |==============================
-O3 -march=znver1 -flto ................... 183 |==============================
-Ofast -march=znver1 ...................... 182 |==============================
GraphicsMagick 1.3.30
Operation: Enhanced
Iterations Per Minute > Higher Is Better
-O0 ....................................... 90 |==============
-Og ....................................... 173 |===========================
-O1 ....................................... 187 |=============================
-O2 ....................................... 189 |=============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 188 |=============================
-O2 -march=znver1 ......................... 191 |==============================
-O2 -flto ................................. 190 |==============================
-O3 ....................................... 181 |============================
-O3 -march=znver1 ......................... 191 |==============================
-O3 -march=znver1 -flto ................... 186 |=============================
-Ofast -march=znver1 ...................... 193 |==============================
GraphicsMagick 1.3.30
Operation: HWB Color Space
Iterations Per Minute > Higher Is Better
-O0 ....................................... 102 |==============
-Og ....................................... 195 |===========================
-O1 ....................................... 210 |=============================
-O2 ....................................... 211 |==============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 212 |==============================
-O2 -march=znver1 ......................... 211 |==============================
-O2 -flto ................................. 214 |==============================
-O3 ....................................... 203 |============================
-O3 -march=znver1 ......................... 210 |=============================
-O3 -march=znver1 -flto ................... 209 |=============================
-Ofast -march=znver1 ...................... 209 |=============================
GraphicsMagick 1.3.30
Operation: Swirl
Iterations Per Minute > Higher Is Better
-O0 ....................................... 96 |===============
-Og ....................................... 181 |============================
-O1 ....................................... 194 |==============================
-O2 ....................................... 195 |==============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 196 |==============================
-O2 -march=znver1 ......................... 196 |==============================
-O2 -flto ................................. 196 |==============================
-O3 ....................................... 189 |=============================
-O3 -march=znver1 ......................... 195 |==============================
-O3 -march=znver1 -flto ................... 194 |==============================
-Ofast -march=znver1 ...................... 196 |==============================
GraphicsMagick 1.3.30
Operation: Noise-Gaussian
Iterations Per Minute > Higher Is Better
-O0 ....................................... 92 |===============
-Og ....................................... 168 |===========================
-O1 ....................................... 179 |=============================
-O2 ....................................... 180 |=============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 178 |=============================
-O2 -march=znver1 ......................... 180 |=============================
-O2 -flto ................................. 180 |=============================
-O3 ....................................... 172 |============================
-O3 -march=znver1 ......................... 180 |=============================
-O3 -march=znver1 -flto ................... 178 |=============================
-Ofast -march=znver1 ...................... 187 |==============================
SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops > Higher Is Better
-O0 ....................................... 832 |==============
-Og ....................................... 919 |================
-O1 ....................................... 919 |================
-O2 ....................................... 919 |================
-O2 -ftree-vectorize -ftree-slp-vectorize . 919 |================
-O2 -march=znver1 ......................... 1016 |=================
-O2 -flto ................................. 918 |================
-O3 ....................................... 1427 |=========================
-O3 -march=znver1 ......................... 1689 |=============================
-O3 -march=znver1 -flto ................... 1675 |=============================
-Ofast -march=znver1 ...................... 1676 |=============================
SciMark 2.0
Computational Test: Monte Carlo
Mflops > Higher Is Better
-O0 ....................................... 108 |==
-Og ....................................... 210 |====
-O1 ....................................... 576 |===========
-O2 ....................................... 560 |===========
-O2 -ftree-vectorize -ftree-slp-vectorize . 560 |===========
-O2 -march=znver1 ......................... 557 |===========
-O2 -flto ................................. 568 |===========
-O3 ....................................... 560 |===========
-O3 -march=znver1 ......................... 557 |===========
-O3 -march=znver1 -flto ................... 1480 |=============================
-Ofast -march=znver1 ...................... 561 |===========
GraphicsMagick 1.3.30
Operation: Rotate
Iterations Per Minute > Higher Is Better
-O0 ....................................... 98 |===============
-Og ....................................... 181 |============================
-O1 ....................................... 191 |==============================
-O2 ....................................... 191 |==============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 190 |==============================
-O2 -march=znver1 ......................... 191 |==============================
-O2 -flto ................................. 191 |==============================
-O3 ....................................... 183 |=============================
-O3 -march=znver1 ......................... 190 |==============================
-O3 -march=znver1 -flto ................... 188 |==============================
-Ofast -march=znver1 ...................... 189 |==============================
AOBench
Size: 2048 x 2048 - Total Time
Seconds < Lower Is Better
-O0 ....................................... 92.50 |============================
-Og ....................................... 77.41 |=======================
-O1 ....................................... 56.61 |=================
-O2 ....................................... 55.54 |=================
-O2 -ftree-vectorize -ftree-slp-vectorize . 55.53 |=================
-O2 -march=znver1 ......................... 54.35 |================
-O2 -flto ................................. 55.52 |=================
-O3 ....................................... 53.53 |================
-O3 -march=znver1 ......................... 51.49 |================
-O3 -march=znver1 -flto ................... 52.08 |================
-Ofast -march=znver1 ...................... 51.73 |================
GraphicsMagick 1.3.30
Operation: Resizing
Iterations Per Minute > Higher Is Better
-O0 ....................................... 74 |=================
-Og ....................................... 120 |===========================
-O1 ....................................... 126 |=============================
-O2 ....................................... 131 |==============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 128 |=============================
-O2 -march=znver1 ......................... 127 |=============================
-O2 -flto ................................. 128 |=============================
-O3 ....................................... 118 |===========================
-O3 -march=znver1 ......................... 127 |=============================
-O3 -march=znver1 -flto ................... 125 |=============================
-Ofast -march=znver1 ...................... 124 |============================
PostgreSQL pgbench 10.3
Scaling: Buffer Test - Test: Single Thread - Mode: Read Only
TPS > Higher Is Better
-O0 ....................................... 9063 |================
-Og ....................................... 13333 |=======================
-O1 ....................................... 13303 |=======================
-O2 ....................................... 14931 |==========================
-O2 -ftree-vectorize -ftree-slp-vectorize . 15353 |===========================
-O2 -march=znver1 ......................... 15111 |==========================
-O2 -flto ................................. 14851 |==========================
-O3 ....................................... 15099 |==========================
-O3 -march=znver1 ......................... 15188 |===========================
-O3 -march=znver1 -flto ................... 16012 |============================
-Ofast -march=znver1 ...................... 15352 |===========================
Timed HMMer Search 2.3.2
Pfam Database Search
Seconds < Lower Is Better
-O0 ....................................... 9.02 |=============================
-Og ....................................... 7.39 |========================
-O1 ....................................... 6.93 |======================
-O2 ....................................... 6.62 |=====================
-O2 -ftree-vectorize -ftree-slp-vectorize . 6.82 |======================
-O2 -march=znver1 ......................... 6.54 |=====================
-O2 -flto ................................. 6.56 |=====================
-O3 ....................................... 6.57 |=====================
-O3 -march=znver1 ......................... 6.29 |====================
-O3 -march=znver1 -flto ................... 6.16 |====================
-Ofast -march=znver1 ...................... 6.00 |===================
x264 2018-09-25
H.264 Video Encoding
Frames Per Second > Higher Is Better
-O0 ....................................... 102 |=====================
-Og ....................................... 142 |=============================
-O1 ....................................... 145 |==============================
-O2 ....................................... 144 |=============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 144 |=============================
-O2 -march=znver1 ......................... 144 |=============================
-O3 ....................................... 147 |==============================
-O3 -march=znver1 ......................... 144 |=============================
-Ofast -march=znver1 ...................... 144 |=============================
PostgreSQL pgbench 10.3
Scaling: Buffer Test - Test: Normal Load - Mode: Read Write
TPS > Higher Is Better
-O0 ....................................... 4585 |==========================
-Og ....................................... 3767 |======================
-O1 ....................................... 4301 |=========================
-O2 ....................................... 4167 |========================
-O2 -ftree-vectorize -ftree-slp-vectorize . 4239 |========================
-O2 -march=znver1 ......................... 4272 |========================
-O2 -flto ................................. 4095 |=======================
-O3 ....................................... 4262 |========================
-O3 -march=znver1 ......................... 5068 |=============================
-O3 -march=znver1 -flto ................... 4319 |=========================
-Ofast -march=znver1 ...................... 4102 |=======================
libjpeg-turbo tjbench 1.5.3
Test: Decompression Throughput
Megapixels/sec > Higher Is Better
-O0 ....................................... 111 |=======================
-Og ....................................... 141 |=============================
-O1 ....................................... 139 |=============================
-O2 ....................................... 140 |=============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 139 |=============================
-O2 -march=znver1 ......................... 142 |==============================
-O2 -flto ................................. 140 |=============================
-O3 ....................................... 141 |=============================
-O3 -march=znver1 ......................... 144 |==============================
-O3 -march=znver1 -flto ................... 144 |==============================
-Ofast -march=znver1 ...................... 144 |==============================
PostgreSQL pgbench 10.3
Scaling: Buffer Test - Test: Single Thread - Mode: Read Write
TPS > Higher Is Better
-O0 ....................................... 886 |======================
-Og ....................................... 1080 |===========================
-O1 ....................................... 1065 |===========================
-O2 ....................................... 1037 |==========================
-O2 -ftree-vectorize -ftree-slp-vectorize . 1125 |============================
-O2 -march=znver1 ......................... 1060 |===========================
-O2 -flto ................................. 1127 |=============================
-O3 ....................................... 1079 |===========================
-O3 -march=znver1 ......................... 1145 |=============================
-O3 -march=znver1 -flto ................... 1074 |===========================
-Ofast -march=znver1 ...................... 1125 |============================
SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops > Higher Is Better
-O0 ....................................... 201 |=======================
-Og ....................................... 257 |==============================
-O1 ....................................... 226 |==========================
-O2 ....................................... 230 |===========================
-O2 -ftree-vectorize -ftree-slp-vectorize . 231 |===========================
-O2 -march=znver1 ......................... 229 |===========================
-O2 -flto ................................. 232 |===========================
-O3 ....................................... 232 |===========================
-O3 -march=znver1 ......................... 227 |==========================
-O3 -march=znver1 -flto ................... 230 |===========================
-Ofast -march=znver1 ...................... 221 |==========================
PostgreSQL pgbench 10.3
Scaling: Buffer Test - Test: Normal Load - Mode: Read Only
TPS > Higher Is Better
-O0 ....................................... 419700 |=====================
-Og ....................................... 507203 |==========================
-O1 ....................................... 515102 |==========================
-O2 ....................................... 515340 |==========================
-O2 -ftree-vectorize -ftree-slp-vectorize . 529699 |===========================
-O2 -march=znver1 ......................... 510425 |==========================
-O2 -flto ................................. 520570 |===========================
-O3 ....................................... 490551 |=========================
-O3 -march=znver1 ......................... 505031 |==========================
-O3 -march=znver1 -flto ................... 454256 |=======================
-Ofast -march=znver1 ...................... 508384 |==========================
John The Ripper 1.8.0-jumbo-1
Test: Traditional DES
Real C/S > Higher Is Better
-O0 ....................................... 218232000 |====================
-Og ....................................... 239289333 |======================
-O1 ....................................... 257067200 |========================
-O2 ....................................... 257407667 |========================
-O2 -ftree-vectorize -ftree-slp-vectorize . 257058000 |========================
-O2 -march=znver1 ......................... 255957000 |========================
-O2 -flto ................................. 260736667 |========================
-O3 ....................................... 253868583 |=======================
-O3 -march=znver1 ......................... 260019667 |========================
-O3 -march=znver1 -flto ................... 254777333 |=======================
-Ofast -march=znver1 ...................... 258770667 |========================
Bullet Physics Engine 2.81
Test: 1000 Stack
Seconds < Lower Is Better
-O0 ....................................... 6.00 |============================
-Og ....................................... 6.02 |============================
-O1 ....................................... 6.00 |============================
-O2 ....................................... 6.01 |============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 5.98 |===========================
-O2 -march=znver1 ......................... 5.80 |===========================
-O2 -flto ................................. 6.32 |=============================
-O3 ....................................... 6.05 |============================
-O3 -march=znver1 ......................... 5.80 |===========================
-Ofast -march=znver1 ...................... 5.80 |===========================
Hierarchical INTegration 1.0
Test: DOUBLE
QUIPs > Higher Is Better
-O0 ....................................... 598545342 |=======================
-Og ....................................... 597234266 |=======================
-O1 ....................................... 585060029 |======================
-O2 ....................................... 599481605 |=======================
-O2 -ftree-vectorize -ftree-slp-vectorize . 602535297 |=======================
-O2 -march=znver1 ......................... 617516626 |========================
-O2 -flto ................................. 626640400 |========================
-O3 ....................................... 595428047 |=======================
-O3 -march=znver1 ......................... 589289926 |=======================
-O3 -march=znver1 -flto ................... 618644101 |========================
-Ofast -march=znver1 ...................... 605331833 |=======================
Bullet Physics Engine 2.81
Test: 136 Ragdolls
Seconds < Lower Is Better
-O0 ....................................... 3.14 |============================
-Og ....................................... 3.14 |============================
-O1 ....................................... 3.15 |============================
-O2 ....................................... 3.14 |============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 3.15 |============================
-O2 -march=znver1 ......................... 3.05 |===========================
-O2 -flto ................................. 3.24 |=============================
-O3 ....................................... 3.15 |============================
-O3 -march=znver1 ......................... 3.06 |===========================
-Ofast -march=znver1 ...................... 3.06 |===========================
SVT-VP9 2019-02-17
1080p 8-bit YUV To VP9 Video Encode
Frames Per Second > Higher Is Better
-Og ....................................... 92.68 |===========================
-O2 -ftree-vectorize -ftree-slp-vectorize . 94.82 |===========================
-O2 -march=znver1 ......................... 95.91 |===========================
-O2 -flto ................................. 95.79 |===========================
-O3 -march=znver1 -flto ................... 97.26 |============================
-Ofast -march=znver1 ...................... 97.80 |============================
Bullet Physics Engine 2.81
Test: 1000 Convex
Seconds < Lower Is Better
-O0 ....................................... 5.36 |=============================
-Og ....................................... 5.37 |=============================
-O1 ....................................... 5.37 |=============================
-O2 ....................................... 5.37 |=============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 5.37 |=============================
-O2 -march=znver1 ......................... 5.19 |============================
-O2 -flto ................................. 5.40 |=============================
-O3 ....................................... 5.38 |=============================
-O3 -march=znver1 ......................... 5.18 |============================
-Ofast -march=znver1 ...................... 5.18 |============================
VP9 libvpx Encoding 1.8.0
vpxenc VP9 1080p Video Encode
Frames Per Second > Higher Is Better
-Og ....................................... 20.39 |===========================
-O2 -ftree-vectorize -ftree-slp-vectorize . 20.34 |===========================
-O2 -march=znver1 ......................... 20.05 |===========================
-O3 -march=znver1 -flto ................... 20.86 |============================
-Ofast -march=znver1 ...................... 20.13 |===========================
SVT-AV1 2019-02-03
1080p 8-bit YUV To AV1 Video Encode
Frames Per Second > Higher Is Better
-O0 ....................................... 1.69 |============================
-Og ....................................... 1.73 |=============================
-O1 ....................................... 1.70 |============================
-O2 ....................................... 1.69 |============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 1.67 |============================
-O2 -march=znver1 ......................... 1.70 |============================
-O2 -flto ................................. 1.69 |============================
-O3 ....................................... 1.73 |=============================
-O3 -march=znver1 ......................... 1.68 |============================
-O3 -march=znver1 -flto ................... 1.71 |=============================
-Ofast -march=znver1 ...................... 1.70 |============================
VP9 libvpx Encoding 1.8.0
vpxenc VP9 1080p Video Encode
Frames Per Second > Higher Is Better
-O0 ....................................... 12.50 |===========================
-Og ....................................... 12.52 |===========================
-O1 ....................................... 12.53 |============================
-O2 ....................................... 12.54 |============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 12.56 |============================
-O2 -march=znver1 ......................... 12.42 |===========================
-O3 ....................................... 12.31 |===========================
-O3 -march=znver1 ......................... 12.41 |===========================
-O3 -march=znver1 -flto ................... 12.75 |============================
-Ofast -march=znver1 ...................... 12.37 |===========================
x265 3.0
H.265 1080p Video Encoding
Frames Per Second > Higher Is Better
-O0 ....................................... 35.00 |============================
-Og ....................................... 34.76 |===========================
-O1 ....................................... 35.62 |============================
-O2 ....................................... 34.55 |===========================
-O2 -ftree-vectorize -ftree-slp-vectorize . 35.41 |============================
-O2 -march=znver1 ......................... 34.80 |===========================
-O2 -flto ................................. 35.07 |============================
-O3 ....................................... 35.21 |============================
-O3 -march=znver1 ......................... 35.57 |============================
-Ofast -march=znver1 ...................... 34.91 |===========================
Bullet Physics Engine 2.81
Test: 3000 Fall
Seconds < Lower Is Better
-O0 ....................................... 5.14 |=============================
-Og ....................................... 5.15 |=============================
-O1 ....................................... 5.16 |=============================
-O2 ....................................... 5.14 |=============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 5.14 |=============================
-O2 -march=znver1 ......................... 5.08 |============================
-O2 -flto ................................. 5.21 |=============================
-O3 ....................................... 5.16 |=============================
-O3 -march=znver1 ......................... 5.07 |============================
-Ofast -march=znver1 ...................... 5.09 |============================
Bullet Physics Engine 2.81
Test: Raytests
Seconds < Lower Is Better
-O0 ....................................... 3.11 |=============================
-Og ....................................... 3.11 |=============================
-O1 ....................................... 3.12 |=============================
-O2 ....................................... 3.12 |=============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 3.11 |=============================
-O2 -march=znver1 ......................... 3.10 |=============================
-O2 -flto ................................. 3.05 |============================
-O3 ....................................... 3.11 |=============================
-O3 -march=znver1 ......................... 3.09 |=============================
-Ofast -march=znver1 ...................... 3.09 |=============================
Stockfish 9
Total Time
Nodes Per Second > Higher Is Better
-O0 ....................................... 105868175 |========================
-Og ....................................... 105709690 |========================
-O1 ....................................... 105698092 |========================
-O2 ....................................... 104480422 |========================
-O2 -ftree-vectorize -ftree-slp-vectorize . 104197865 |=======================
-O2 -march=znver1 ......................... 106084276 |========================
-O2 -flto ................................. 104536605 |========================
-O3 ....................................... 104121840 |=======================
-O3 -march=znver1 ......................... 106497994 |========================
-Ofast -march=znver1 ...................... 106507244 |========================
Bullet Physics Engine 2.81
Test: Convex Trimesh
Seconds < Lower Is Better
-O0 ....................................... 1.35 |=============================
-Og ....................................... 1.35 |=============================
-O1 ....................................... 1.35 |=============================
-O2 ....................................... 1.35 |=============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 1.35 |=============================
-O2 -march=znver1 ......................... 1.32 |============================
-O2 -flto ................................. 1.33 |=============================
-O3 ....................................... 1.35 |=============================
-O3 -march=znver1 ......................... 1.32 |============================
-Ofast -march=znver1 ...................... 1.32 |============================
Bullet Physics Engine 2.81
Test: Prim Trimesh
Seconds < Lower Is Better
-O0 ....................................... 1.11 |=============================
-Og ....................................... 1.11 |=============================
-O1 ....................................... 1.11 |=============================
-O2 ....................................... 1.11 |=============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 1.11 |=============================
-O2 -march=znver1 ......................... 1.11 |=============================
-O2 -flto ................................. 1.09 |============================
-O3 ....................................... 1.11 |=============================
-O3 -march=znver1 ......................... 1.11 |=============================
-Ofast -march=znver1 ...................... 1.11 |=============================
SVT-AV1 2019-02-15
1080p 8-bit YUV To AV1 Video Encode
Frames Per Second > Higher Is Better
-O0 ....................................... 5.88 |=============================
-Og ....................................... 5.86 |=============================
-O1 ....................................... 5.87 |=============================
-O2 ....................................... 5.81 |=============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 5.89 |=============================
-O2 -march=znver1 ......................... 5.84 |=============================
-O2 -flto ................................. 5.90 |=============================
-O3 ....................................... 5.90 |=============================
-O3 -march=znver1 ......................... 5.89 |=============================
-O3 -march=znver1 -flto ................... 5.84 |=============================
-Ofast -march=znver1 ...................... 5.91 |=============================
Hierarchical INTegration 1.0
Test: FLOAT
QUIPs > Higher Is Better
-O0 ....................................... 267404445 |========================
-Og ....................................... 267368671 |========================
-O1 ....................................... 268455578 |========================
-O2 ....................................... 267311970 |========================
-O2 -ftree-vectorize -ftree-slp-vectorize . 267172145 |========================
-O2 -march=znver1 ......................... 267268023 |========================
-O2 -flto ................................. 268173400 |========================
-O3 ....................................... 267315647 |========================
-O3 -march=znver1 ......................... 268506472 |========================
-O3 -march=znver1 -flto ................... 267239405 |========================
-Ofast -march=znver1 ...................... 267055407 |========================
TSCP 1.81
AI Chess Performance
Nodes Per Second > Higher Is Better
-O0 ....................................... 865459 |===========================
-Og ....................................... 865187 |===========================
-O1 ....................................... 864102 |===========================
-O2 ....................................... 864373 |===========================
-O2 -ftree-vectorize -ftree-slp-vectorize . 864916 |===========================
-O2 -march=znver1 ......................... 864915 |===========================
-O2 -flto ................................. 864101 |===========================
-O3 ....................................... 864915 |===========================
-O3 -march=znver1 ......................... 865732 |===========================
-O3 -march=znver1 -flto ................... 863018 |===========================
-Ofast -march=znver1 ...................... 864373 |===========================
ctx_clock
Context Switch Time
Clocks < Lower Is Better
-Og ....................................... 132 |==============================
-O2 -ftree-vectorize -ftree-slp-vectorize . 132 |==============================
-O2 -flto ................................. 132 |==============================
-O3 -march=znver1 -flto ................... 132 |==============================
Zstd Compression 1.3.4
Compressing ubuntu-16.04.3-server-i386.img, Compression Level 19
Seconds < Lower Is Better
-O0 ....................................... 23.12 |============================
-Og ....................................... 14.39 |=================
-O1 ....................................... 14.11 |=================
-O2 ....................................... 14.48 |==================
-O2 -ftree-vectorize -ftree-slp-vectorize . 13.67 |=================
-O2 -march=znver1 ......................... 14.71 |==================
-O2 -flto ................................. 14.08 |=================
-O3 ....................................... 13.66 |=================
-O3 -march=znver1 ......................... 14.37 |=================
-O3 -march=znver1 -flto ................... 13.16 |================
-Ofast -march=znver1 ...................... 13.77 |=================
John The Ripper 1.8.0-jumbo-1
Test: Blowfish
Real C/S > Higher Is Better
-O0 ....................................... 15179 |======
-Og ....................................... 56453 |========================
-O1 ....................................... 65995 |============================
-O2 ....................................... 62718 |==========================
-O2 -ftree-vectorize -ftree-slp-vectorize . 63586 |===========================
-O2 -march=znver1 ......................... 61309 |==========================
-O2 -flto ................................. 65117 |===========================
-O3 ....................................... 65806 |============================
-O3 -march=znver1 ......................... 66823 |============================
-O3 -march=znver1 -flto ................... 58764 |=========================
-Ofast -march=znver1 ...................... 62841 |==========================
SciMark 2.0
Computational Test: Dense LU Matrix Factorization
Mflops > Higher Is Better
-O0 ....................................... 512 |===
-Og ....................................... 2539 |===============
-O1 ....................................... 3466 |=====================
-O2 ....................................... 2609 |================
-O2 -ftree-vectorize -ftree-slp-vectorize . 4396 |==========================
-O2 -march=znver1 ......................... 3231 |===================
-O2 -flto ................................. 2515 |===============
-O3 ....................................... 4307 |==========================
-O3 -march=znver1 ......................... 4851 |=============================
-O3 -march=znver1 -flto ................... 3300 |====================
-Ofast -march=znver1 ...................... 4089 |========================