AMD EPYC Compiler Tuning Ref

2 x AMD EPYC 7601 32-Core testing with a Dell 02MJ3T (1.2.5 BIOS) and Matrox G200eW3 on Ubuntu 18.04 via the Phoronix Test Suite.

x86-64:

  Processor: 2 x AMD EPYC 7601 32-Core (64 Cores / 128 Threads), Motherboard: Dell 02MJ3T (1.2.5 BIOS), Chipset: AMD Family 17h, Memory: 16 x 32 GB DDR4-2400MT/s 36ASF4G72PZ-2G6D2, Disk: 20 x 500GB Samsung SSD 860 + 120GB SSDSCKJB120G7R, Graphics: Matrox G200eW3, Monitor: VE228, Network: 2 x Broadcom BCM57416 NetXtreme-E 10GBase-T RDMA + 2 x Broadcom NetXtreme BCM5720 PCIe

  OS: Ubuntu 18.04, Kernel: 5.0.0-050000rc6-generic (x86_64) 20190210, Desktop: GNOME Shell 3.28.3, Display Server: X Server, Compiler: GCC 7.3.0, File-System: ext4, Screen Resolution: 1600x1200

znver1:

  Processor: 2 x AMD EPYC 7601 32-Core (64 Cores / 128 Threads), Motherboard: Dell 02MJ3T (1.2.5 BIOS), Chipset: AMD Family 17h, Memory: 16 x 32 GB DDR4-2400MT/s 36ASF4G72PZ-2G6D2, Disk: 20 x 500GB Samsung SSD 860 + 120GB SSDSCKJB120G7R, Graphics: Matrox G200eW3, Monitor: VE228, Network: 2 x Broadcom BCM57416 NetXtreme-E 10GBase-T RDMA + 2 x Broadcom NetXtreme BCM5720 PCIe

  OS: Ubuntu 18.04, Kernel: 5.0.0-050000rc6-generic (x86_64) 20190210, Desktop: GNOME Shell 3.28.3, Display Server: X Server, Compiler: GCC 7.3.0, File-System: ext4, Screen Resolution: 1600x1200

t-test1 2017-01-13
Threads: 1
Seconds < Lower Is Better
x86-64 . 21.05
znver1 . 21.16

t-test1 2017-01-13
Threads: 2
Seconds < Lower Is Better
x86-64 . 8.31
znver1 . 8.25

Sockperf 3.4
Test: Throughput
Messages Per Second > Higher Is Better
x86-64 . 348584
znver1 . 354187

Sockperf 3.4
Test: Latency Ping Pong
usec < Lower Is Better
x86-64 . 3.66
znver1 . 3.70

Sockperf 3.4
Test: Latency Under Load
usec < Lower Is Better
x86-64 . 12.86
znver1 . 24.83

GNU MPC 1.1.0
Multi-Precision Benchmark
Global Score > Higher Is Better
x86-64 . 6107
znver1 . 6120

CloverLeaf
Lagrangian-Eulerian Hydrodynamics
Seconds < Lower Is Better
x86-64 . 1.68
znver1 . 1.21

NAMD 2.13b1
ATPase Simulation - 327,506 Atoms
days/ns < Lower Is Better
x86-64 . 0.44857
znver1 . 0.44982

lzbench 2017-08-08
Test: XZ 0 - Process: Compression
MB/s > Higher Is Better
x86-64 . 24
znver1 . 23

lzbench 2017-08-08
Test: XZ 0 - Process: Decompression
MB/s > Higher Is Better
x86-64 . 75
znver1 . 75

lzbench 2017-08-08
Test: Zstd 1 - Process: Compression
MB/s > Higher Is Better
x86-64 . 336
znver1 . 334

lzbench 2017-08-08
Test: Zstd 1 - Process: Decompression
MB/s > Higher Is Better
x86-64 . 918
znver1 . 906

lzbench 2017-08-08
Test: Brotli 0 - Process: Compression
MB/s > Higher Is Better
x86-64 . 346
znver1 . 347

lzbench 2017-08-08
Test: Brotli 0 - Process: Decompression
MB/s > Higher Is Better
x86-64 . 397
znver1 . 397

lzbench 2017-08-08
Test: Libdeflate 1 - Process: Compression
MB/s > Higher Is Better
x86-64 . 171
znver1 . 167

lzbench 2017-08-08
Test: Libdeflate 1 - Process: Decompression
MB/s > Higher Is Better
x86-64 . 801
znver1 . 804

FFTE 6.0
Test: N=256, 1D Complex FFT Routine
MFLOPS > Higher Is Better
x86-64 . 6702

Timed HMMer Search 2.3.2
Pfam Database Search
Seconds < Lower Is Better
x86-64 . 6.65
znver1 . 6.73

Timed MAFFT Alignment 7.392
Multiple Sequence Alignment
Seconds < Lower Is Better
x86-64 . 3.84
znver1 . 3.85

BLAKE2 20170307
Cycles Per Byte < Lower Is Better
x86-64 . 6.94
znver1 . 6.94

Fhourstones 3.1
Complex Connect-4 Solving
Kpos / sec > Higher Is Better
x86-64 . 9872
znver1 . 9847

CacheBench
Test: Read
MB/s > Higher Is Better
x86-64 . 2216
znver1 . 2216

CacheBench
Test: Write
MB/s > Higher Is Better
x86-64 . 21825
znver1 . 21811

CacheBench
Test: Read / Modify / Write
MB/s > Higher Is Better
x86-64 . 22947
znver1 . 22951

LuaJIT 2.1-git
Test: Composite
Mflops > Higher Is Better
x86-64 . 1143
znver1 . 1137

LuaJIT 2.1-git
Test: Monte Carlo
Mflops > Higher Is Better
x86-64 . 387
znver1 . 387

LuaJIT 2.1-git
Test: Fast Fourier Transform
Mflops > Higher Is Better
x86-64 . 248
znver1 . 249

LuaJIT 2.1-git
Test: Sparse Matrix Multiply
Mflops > Higher Is Better
x86-64 . 921
znver1 . 921

LuaJIT 2.1-git
Test: Dense LU Matrix Factorization
Mflops > Higher Is Better
x86-64 . 2737
znver1 . 2705

LuaJIT 2.1-git
Test: Jacobi Successive Over-Relaxation
Mflops > Higher Is Better
x86-64 . 1422
znver1 . 1421

SciMark 2.0
Computational Test: Composite
Mflops > Higher Is Better
x86-64 . 1839
znver1 . 1892

SciMark 2.0
Computational Test: Monte Carlo
Mflops > Higher Is Better
x86-64 . 560
znver1 . 197

SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops > Higher Is Better
x86-64 . 230
znver1 . 227

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops > Higher Is Better
x86-64 . 2405
znver1 . 2373

SciMark 2.0
Computational Test: Dense LU Matrix Factorization
Mflops > Higher Is Better
x86-64 . 4569
znver1 . 4974

SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops > Higher Is Better
x86-64 . 1431
znver1 . 1686

Botan 2.8.0
Test: KASUMI - Encrypt
MiB/s > Higher Is Better
x86-64 . 73.41
znver1 . 73.19

Botan 2.8.0
Test: KASUMI - Decrypt
MiB/s > Higher Is Better
x86-64 . 70.45
znver1 . 70.83

Botan 2.8.0
Test: AES-256 - Encrypt
MiB/s > Higher Is Better
x86-64 . 4428
znver1 . 4503

Botan 2.8.0
Test: AES-256 - Decrypt
MiB/s > Higher Is Better
x86-64 . 4486
znver1 . 4506

Botan 2.8.0
Test: Twofish - Encrypt
MiB/s > Higher Is Better
x86-64 . 276
znver1 . 282

Botan 2.8.0
Test: Twofish - Decrypt
MiB/s > Higher Is Better
x86-64 . 279
znver1 . 279

Botan 2.8.0
Test: Blowfish - Encrypt
MiB/s > Higher Is Better
x86-64 . 213
znver1 . 214

Botan 2.8.0
Test: Blowfish - Decrypt
MiB/s > Higher Is Better
x86-64 . 212
znver1 . 212

Botan 2.8.0
Test: CAST-256 - Encrypt
MiB/s > Higher Is Better
x86-64 . 113
znver1 . 112

Botan 2.8.0
Test: CAST-256 - Decrypt
MiB/s > Higher Is Better
x86-64 . 113
znver1 . 112

Crafty 25.2
Elapsed Time
Nodes Per Second > Higher Is Better
x86-64 . 5673254
znver1 . 5679304

TSCP 1.81
AI Chess Performance
Nodes Per Second > Higher Is Better
x86-64 . 870965
znver1 . 871513

John The Ripper 1.8.0-jumbo-1
Test: Blowfish
Real C/S > Higher Is Better
x86-64 . 64750
znver1 . 67042

John The Ripper 1.8.0-jumbo-1
Test: Traditional DES
Real C/S > Higher Is Better
x86-64 . 256625333
znver1 . 255191909

TTSIOD 3D Renderer 2.3b
Phong Rendering With Soft-Shadow Mapping
FPS > Higher Is Better
x86-64 . 351
znver1 . 361

SVT-AV1 2019-02-03
1080p 8-bit YUV To AV1 Video Encode
Frames Per Second > Higher Is Better
x86-64 . 1.68
znver1 . 1.71

SVT-HEVC 2019-02-03
1080p 8-bit YUV To HEVC Video Encode
Frames Per Second > Higher Is Better
x86-64 . 161
znver1 . 154

VP9 libvpx Encoding 1.8.0
vpxenc VP9 1080p Video Encode
Frames Per Second > Higher Is Better
x86-64 . 12.21
znver1 . 12.03

x264 2018-09-25
H.264 Video Encoding
Frames Per Second > Higher Is Better
x86-64 . 143
znver1 . 142

x265 3.0
H.265 1080p Video Encoding
Frames Per Second > Higher Is Better
x86-64 . 34.81
znver1 . 35.25

GraphicsMagick 1.3.30
Operation: Swirl
Iterations Per Minute > Higher Is Better
x86-64 . 184
znver1 . 193

GraphicsMagick 1.3.30
Operation: Rotate
Iterations Per Minute > Higher Is Better
x86-64 . 180
znver1 . 189

GraphicsMagick 1.3.30
Operation: Sharpen
Iterations Per Minute > Higher Is Better
x86-64 . 170
znver1 . 182

GraphicsMagick 1.3.30
Operation: Enhanced
Iterations Per Minute > Higher Is Better
x86-64 . 178
znver1 . 189

GraphicsMagick 1.3.30
Operation: Resizing
Iterations Per Minute > Higher Is Better
x86-64 . 118
znver1 . 124

GraphicsMagick 1.3.30
Operation: Noise-Gaussian
Iterations Per Minute > Higher Is Better
x86-64 . 169
znver1 . 177

GraphicsMagick 1.3.30
Operation: HWB Color Space
Iterations Per Minute > Higher Is Better
x86-64 . 197
znver1 . 207

Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS > Higher Is Better
x86-64 . 1004
znver1 . 949

7-Zip Compression 16.02
Compress Speed Test
MIPS > Higher Is Better
x86-64 . 130837
znver1 . 134021

Timed Apache Compilation 2.4.7
Time To Compile
Seconds < Lower Is Better
x86-64 . 25.03
znver1 . 25.27

Timed GCC Compilation 8.2
Time To Compile
Seconds < Lower Is Better
x86-64 . 821
znver1 . 819

Timed ImageMagick Compilation 6.9.0
Time To Compile
Seconds < Lower Is Better
x86-64 . 23.38
znver1 . 23.24

Timed LLVM Compilation 6.0.1
Time To Compile
Seconds < Lower Is Better
x86-64 . 151
znver1 . 151

Timed PHP Compilation 7.1.9
Time To Compile
Seconds < Lower Is Better
x86-64 . 61.22
znver1 . 61.94

C-Ray 1.1
Total Time - 4K, 16 Rays Per Pixel
Seconds < Lower Is Better
x86-64 . 15.37
znver1 . 13.17

Primesieve 7.2
1e12 Prime Number Generation
Seconds < Lower Is Better
x86-64 . 5.34
znver1 . 5.30

Smallpt 1.0
Global Illumination Renderer; 128 Samples
Seconds < Lower Is Better
x86-64 . 2.87
znver1 . 2.86

AOBench
Size: 2048 x 2048 - Total Time
Seconds < Lower Is Better
x86-64 . 52.00
znver1 . 49.84

Bullet Physics Engine 2.81
Test: Raytests
Seconds < Lower Is Better
x86-64 . 3.20
znver1 . 3.38

Bullet Physics Engine 2.81
Test: 3000 Fall
Seconds < Lower Is Better
x86-64 . 5.15
znver1 . 5.17

Bullet Physics Engine 2.81
Test: 1000 Stack
Seconds < Lower Is Better
x86-64 . 6.08
znver1 . 6.04

Bullet Physics Engine 2.81
Test: 1000 Convex
Seconds < Lower Is Better
x86-64 . 5.62
znver1 . 6.45

Bullet Physics Engine 2.81
Test: 136 Ragdolls
Seconds < Lower Is Better
x86-64 . 3.12
znver1 . 3.09

Bullet Physics Engine 2.81
Test: Prim Trimesh
Seconds < Lower Is Better
x86-64 . 1.11
znver1 . 1.13

Bullet Physics Engine 2.81
Test: Convex Trimesh
Seconds < Lower Is Better
x86-64 . 1.38
znver1 . 1.49

LZMA Compression
256MB File Compression
Seconds < Lower Is Better
x86-64 . 341
znver1 . 341

XZ Compression 5.2.4
Compressing ubuntu-16.04.3-server-i386.img, Compression Level 9
Seconds < Lower Is Better
x86-64 . 108
znver1 . 113

Zstd Compression 1.3.4
Compressing ubuntu-16.04.3-server-i386.img, Compression Level 19
Seconds < Lower Is Better
x86-64 . 14.06
znver1 . 13.93

FLAC Audio Encoding 1.3.2
WAV To FLAC
Seconds < Lower Is Better
x86-64 . 13.50
znver1 . 13.98

LAME MP3 Encoding 3.100
WAV To MP3
Seconds < Lower Is Better
x86-64 . 11.50
znver1 . 11.49

Ogg Encoding 1.3.3
WAV To Ogg
Seconds < Lower Is Better
x86-64 . 7.56
znver1 . 7.57

m-queens 1.2
Time To Solve
Seconds < Lower Is Better
x86-64 . 13.51
znver1 . 13.53

Mencoder 1.3.0
AVI To LAVC
Seconds < Lower Is Better
x86-64 . 22.19
znver1 . 22.37

Tachyon 0.98.9
Total Time
Seconds < Lower Is Better
x86-64 . 1.60
znver1 . 1.57

OpenSSL 1.1.1
RSA 4096-bit Performance
Signs Per Second > Higher Is Better
x86-64 . 9212
znver1 . 9207

Aircrack-ng 1.3
k/s > Higher Is Better
x86-64 . 81230
znver1 . 81166

libjpeg-turbo tjbench 1.5.3
Test: Decompression Throughput
Megapixels/sec > Higher Is Better
x86-64 . 139
znver1 . 142

PostgreSQL pgbench 10.3
Scaling: Buffer Test - Test: Normal Load - Mode: Read Only
TPS > Higher Is Better
x86-64 . 488938
znver1 . 492868

PostgreSQL pgbench 10.3
Scaling: Buffer Test - Test: Normal Load - Mode: Read Write
TPS > Higher Is Better
x86-64 . 4294
znver1 . 3874

CppPerformanceBenchmarks 9
Test: Atol
Seconds < Lower Is Better
x86-64 . 90.37
znver1 . 91.06

CppPerformanceBenchmarks 9
Test: Ctype
Seconds < Lower Is Better
x86-64 . 45.63
znver1 . 42.27

CppPerformanceBenchmarks 9
Test: Math Library
Seconds < Lower Is Better
x86-64 . 469
znver1 . 471

CppPerformanceBenchmarks 9
Test: Random Numbers
Seconds < Lower Is Better
x86-64 . 1370
znver1 . 1415

CppPerformanceBenchmarks 9
Test: Stepanov Vector
Seconds < Lower Is Better
x86-64 . 96.84
znver1 . 96.92

CppPerformanceBenchmarks 9
Test: Function Objects
Seconds < Lower Is Better
x86-64 . 20.68
znver1 . 20.76

CppPerformanceBenchmarks 9
Test: Stepanov Abstraction
Seconds < Lower Is Better
x86-64 . 37.35
znver1 . 37.38

Redis 4.0.8
Test: LPOP
Requests Per Second > Higher Is Better
x86-64 . 1767032
znver1 . 1801038

Redis 4.0.8
Test: SADD
Requests Per Second > Higher Is Better
x86-64 . 1359072
znver1 . 1513647

Redis 4.0.8
Test: LPUSH
Requests Per Second > Higher Is Better
x86-64 . 1049735
znver1 . 1065414

Redis 4.0.8
Test: GET
Requests Per Second > Higher Is Better
x86-64 . 1738242
znver1 . 1621671

Redis 4.0.8
Test: SET
Requests Per Second > Higher Is Better
x86-64 . 1198470
znver1 . 1203388

Sysbench 2018-07-28
Test: Memory
Events Per Second > Higher Is Better
x86-64 . 4218551
znver1 . 4208722

Sysbench 2018-07-28
Test: CPU
Events Per Second > Higher Is Better
x86-64 . 89025
znver1 . 88575

Xsbench 2017-07-06
Lookups/s > Higher Is Better
x86-64 . 1486818
znver1 . 1498420

NGINX Benchmark 1.9.9
Static Web Page Serving
Requests Per Second > Higher Is Better
x86-64 . 20823
znver1 . 22235

Apache Benchmark 2.4.29
Static Web Page Serving
Requests Per Second > Higher Is Better
x86-64 . 16725
znver1 . 16211

Apache Siege 2.4.29
Concurrent Users: 1
Transactions Per Second > Higher Is Better
x86-64 . 6551
znver1 . 6619

Apache Siege 2.4.29
Concurrent Users: 10
Transactions Per Second > Higher Is Better
x86-64 . 19828
znver1 . 19752

Apache Siege 2.4.29
Concurrent Users: 50
Transactions Per Second > Higher Is Better
x86-64 . 19085
znver1 . 19103

Apache Siege 2.4.29
Concurrent Users: 100
Transactions Per Second > Higher Is Better
x86-64 . 20759
znver1 . 20737

Apache Siege 2.4.29
Concurrent Users: 200
Transactions Per Second > Higher Is Better
x86-64 . 21765
znver1 . 21527

Apache Siege 2.4.29
Concurrent Users: 250
Transactions Per Second > Higher Is Better
x86-64 . 22158
znver1 . 21897
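
Because the results above mix "Lower Is Better" and "Higher Is Better" metrics, the two configurations are easier to compare once each pair is normalized to a signed percentage. The following is a minimal sketch of that arithmetic, not part of the Phoronix Test Suite: the function name `relative_gain` and the `SAMPLES` list are hypothetical, and the three sample pairs are copied by hand from the results above.

```python
def relative_gain(baseline: float, tuned: float, higher_is_better: bool) -> float:
    """Return how much better `tuned` is than `baseline`, in percent.

    Positive means the tuned result is better, negative means it is worse,
    regardless of whether the metric is higher-is-better or lower-is-better.
    """
    if baseline <= 0 or tuned <= 0:
        raise ValueError("results must be positive")
    # For lower-is-better metrics (e.g. seconds), invert the ratio so that
    # a smaller tuned value still yields a positive gain.
    ratio = tuned / baseline if higher_is_better else baseline / tuned
    return (ratio - 1.0) * 100.0


# (test name, x86-64 result, znver1 result, higher_is_better)
# Values copied by hand from the results above; the list itself is illustrative.
SAMPLES = [
    ("CloverLeaf (Seconds)", 1.68, 1.21, False),
    ("SciMark 2.0 Monte Carlo (Mflops)", 560, 197, True),
    ("Sockperf Latency Under Load (usec)", 12.86, 24.83, False),
]

if __name__ == "__main__":
    for name, x86_64, znver1, higher_is_better in SAMPLES:
        gain = relative_gain(x86_64, znver1, higher_is_better)
        print(f"{name}: znver1 vs x86-64 {gain:+.1f}%")
```

With these three pairs the sketch reports roughly +38.8% for CloverLeaf, -64.8% for SciMark Monte Carlo, and -48.2% for Sockperf Latency Under Load. Most of the other pairs listed differ by only a few percent.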