AMD EPYC vs. Intel Xeon vs. Amazon EC2 Cloud
Some initial AMD EPYC 7601 tests on Ubuntu 17.04 with Linux 4.13, for a future article on Phoronix.com. Benchmarks by Michael Larabel.
HTML result view exported from: https://openbenchmarking.org/result/1709189-TY-EPYCCLOUD78&grs&sor&rro .

AMD EPYC vs. Intel Xeon vs. Amazon EC2 Cloud
For each system after the first, only the fields that the export restates are shown.

AMD EPYC 7601
  Processor: AMD EPYC 7601 32-Core @ 2.20GHz (64 Cores)
  Motherboard: TYAN B8026T70AE24HR
  Chipset: AMD Device 1450
  Memory: 129024MB
  Disk: 234GB
  Graphics: ASPEED ASPEED Family
  Monitor: Acer P243W
  Network: Broadcom Limited NetXtreme BCM5720 Gigabit PCIe
  OS: Ubuntu 17.04
  Kernel: 4.13.0-041300-generic (x86_64)
  Display Driver: modesetting 1.19.3
  Compiler: GCC 6.3.0 20170406
  File-System: ext4
  Screen Resolution: 1920x1200
  Desktop: Unity 7.5.0

AMD EPYC 7601 (NUMA Interleave All)
  (no fields restated)

2 x Intel Xeon Gold 6138
  Processor: 2 x Intel Xeon Gold 6138 @ 3.70GHz (80 Cores)
  Motherboard: TYAN S7106
  Chipset: Intel Device 2020
  Memory: 96256MB
  Disk: 256GB Samsung SSD 850 + 2000GB Seagate ST2000DM006-2DM1 + 2 x 120GB TOSHIBA-TR150
  Network: Intel I210 Gigabit Connection

m4.10xlarge
  Processor: 2 x Intel Xeon E5-2676 v3 @ 3.00GHz (40 Cores)
  Motherboard: Xen HVM domU
  Chipset: Intel 440FX- 82441FX PMC
  Memory: 161792MB
  Disk: 8GB
  Graphics: Cirrus Logic GD 5446
  Network: Intel 82599 Virtual Function
  OS: Ubuntu 16.04
  Kernel: 4.4.0-1022-aws (x86_64)
  Compiler: GCC 5.4.0 20160609
  System Layer: Xen HVM domU 4.2.amazon

m4.16xlarge
  Processor: 2 x Intel Xeon E5-2686 v4 @ 3.00GHz (64 Cores)
  Memory: 258048MB
  Network: Device 1d0f:ec20

c4.4xlarge
  Processor: Intel Xeon E5-2666 v3 @ 2.90GHz (16 Cores)
  Memory: 30720MB
  Network: Intel 82599 Virtual Function

c4.8xlarge
  Processor: 2 x Intel Xeon E5-2666 v3 @ 3.50GHz (36 Cores)
  Memory: 60416MB

c3.8xlarge
  Processor: 2 x Intel Xeon E5-2680 v2 @ 2.79GHz (32 Cores)

r4.16xlarge
  Processor: 2 x Intel Xeon E5-2686 v4 @ 3.00GHz (64 Cores)
  Memory: 492544MB
  Network: Device 1d0f:ec20

Compiler Details
- AMD EPYC 7601, AMD EPYC 7601 (NUMA Interleave All), 2 x Intel Xeon Gold 6138 (identical options on all three): --build=x86_64-linux-gnu --disable-browser-plugin --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-arch-directory=amd64 --with-default-libstdcxx-abi=new --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic -v
- m4.10xlarge, m4.16xlarge, c4.4xlarge, c4.8xlarge, c3.8xlarge, r4.16xlarge (identical options on all six): --build=x86_64-linux-gnu --disable-browser-plugin --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-arch-directory=amd64 --with-default-libstdcxx-abi=new --with-multilib-list=m32,m64,mx32 --with-tune=generic -v

Processor Details
- AMD EPYC 7601: Scaling Governor: acpi-cpufreq ondemand
- AMD EPYC 7601 (NUMA Interleave All): Scaling Governor: acpi-cpufreq ondemand
- 2 x Intel Xeon Gold 6138: Scaling Governor: intel_pstate powersave
- m4.10xlarge: Scaling Governor: intel_pstate powersave
- m4.16xlarge: Scaling Governor: intel_pstate powersave
- c4.8xlarge: Scaling Governor: intel_pstate powersave
- r4.16xlarge: Scaling Governor: intel_pstate powersave
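
The governor strings above pair a cpufreq driver name with a governor name, which is the information Linux exposes through sysfs. The following is a minimal sketch of reading those two values, not the method the test suite itself uses. It assumes the standard `/sys/devices/system/cpu/cpu0/cpufreq` layout, which may be absent on some virtualized guests, and the function name `read_scaling_info` is hypothetical.

```python
from pathlib import Path


def read_scaling_info(cpufreq_dir="/sys/devices/system/cpu/cpu0/cpufreq"):
    """Return the cpufreq driver and governor for one CPU.

    A value is None when the corresponding sysfs file does not exist
    (assumption: this is how a system without cpufreq support appears).
    """
    base = Path(cpufreq_dir)
    info = {}
    for key in ("scaling_driver", "scaling_governor"):
        path = base / key
        info[key] = path.read_text().strip() if path.is_file() else None
    return info
```

Calling `read_scaling_info()` on a machine configured like the AMD EPYC 7601 system would be expected to report the driver and governor named in the list above; on a host without the sysfs directory it returns `None` for both keys.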

AMD EPYC vs. Intel Xeon vs. Amazon EC2 Cloud
A dash marks a test with no result for that system.

Test                                      AMD EPYC 7601  AMD EPYC 7601 (NUMA Interleave All)  2 x Intel Xeon Gold 6138  m4.10xlarge  m4.16xlarge  c4.4xlarge  c4.8xlarge  c3.8xlarge  r4.16xlarge
npb: EP.C                                 1610.17        1607.68                              1815.89                   629.38       1036.50      302.17      672.88      565.48      1045.49
c-ray: Total Time                         2.84           2.85                                 2.84                      6.45         4.21         13.22       6.08        8.54        4.16
openssl: RSA 4096-bit Performance         3294.53        3306.37                              4826.70                   2229.10      3835.70      1070.07     2382.30     1836.53     3861.60
rodinia: OpenMP LavaMD                    30.96          30.14                                31.44                     56.47        34.19        115.91      52.77       124.85      34.86
primesieve: 1e12 Prime Number Generation  14.08          14.06                                11.81                     22.18        14.39        45.96       20.75       24.76       14.63
john-the-ripper: Blowfish                 29553          34335                                30373                     23520        39039        11382       25152       19532       38527
npb: LU.A                                 63042.80       62479.08                             53974.49                  37949.85     66710.16     20546.79    -           34478.44    67748.67
build-llvm: Time To Compile               175.11         192.26                               134.64                    213.94       147.24       435.36      209.76      290.03      147.57
npb: LU.C                                 46983.34       50046.90                             50072.25                  36509.86     43175.90     18060.63    -           33661.53    45966.13
build-linux-kernel: Time To Compile       37.40          39.28                                30.54                     41.84        30.62        73.54       39.89       51.38       30.80
parboil: OpenMP Stencil                   13.82          7.75                                 7.75                      11.47        7.63         12.86       11.86       11.20       7.66
x264: H.264 Video Encoding                292.63         288.28                               310.79                    264.84       304.36       257.83      281.33      303.73      317.57
rodinia: OpenMP Streamcluster             23.15          14.66                                22.52                     23.66        20.21        25.67       23.45       15.78       16.23
parboil: OpenMP LBM                       50.95          38.37                                50.04                     74.43        47.45        114.31      65.38       75.51       51.78
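
Because the tests use different units and directions, one way to compare systems across several of them is to normalize each result against a baseline system and take a geometric mean. The sketch below is illustrative only and is not a calculation performed by the export. It uses three rows copied from the overview for three of the systems, takes each test's direction from the per-test results (EP.C is "More Is Better"; C-Ray and LLVM compilation are "Fewer Is Better"), and uses hypothetical helper names (`relative_score`, `geometric_mean`, `summarize`).

```python
import math

# A few rows copied from the overview:
# test -> (higher_is_better, {system: result}).
RESULTS = {
    "npb: EP.C": (True, {
        "AMD EPYC 7601": 1610.17,
        "2 x Intel Xeon Gold 6138": 1815.89,
        "m4.16xlarge": 1036.50,
    }),
    "c-ray: Total Time": (False, {
        "AMD EPYC 7601": 2.84,
        "2 x Intel Xeon Gold 6138": 2.84,
        "m4.16xlarge": 4.21,
    }),
    "build-llvm: Time To Compile": (False, {
        "AMD EPYC 7601": 175.11,
        "2 x Intel Xeon Gold 6138": 134.64,
        "m4.16xlarge": 147.24,
    }),
}


def relative_score(value, baseline, higher_is_better):
    """Ratio against the baseline; above 1.0 means better than the baseline."""
    return value / baseline if higher_is_better else baseline / value


def geometric_mean(values):
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def summarize(results, baseline_system):
    """Geometric mean of per-test relative scores for every system."""
    scores = {}
    for higher_is_better, by_system in results.values():
        baseline = by_system[baseline_system]
        for system, value in by_system.items():
            scores.setdefault(system, []).append(
                relative_score(value, baseline, higher_is_better))
    return {system: geometric_mean(vals) for system, vals in scores.items()}


summary = summarize(RESULTS, "AMD EPYC 7601")
for system, score in sorted(summary.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{system}: {score:.3f}")
```

The baseline system always scores 1.0 by construction. A summary over only three tests says little about overall performance, so the subset here is for demonstration rather than a conclusion about the systems.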

NAS Parallel Benchmarks 3.3, Test / Class: EP.C
Total Mop/s, More Is Better
  c4.4xlarge: 302.17 (SE +/- 0.85, N = 3)
  c3.8xlarge: 565.48 (SE +/- 1.50, N = 3)
  m4.10xlarge: 629.38 (SE +/- 1.36, N = 3)
  c4.8xlarge: 672.88 (SE +/- 1.49, N = 3)
  m4.16xlarge: 1036.50 (SE +/- 4.41, N = 3)
  r4.16xlarge: 1045.49 (SE +/- 1.67, N = 3)
  AMD EPYC 7601 (NUMA Interleave All): 1607.68 (SE +/- 0.90, N = 3)
  AMD EPYC 7601: 1610.17 (SE +/- 0.32, N = 3)
  2 x Intel Xeon Gold 6138: 1815.89 (SE +/- 34.76, N = 6)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. c4.4xlarge: Open MPI 1.10.2
3. c3.8xlarge: Open MPI 1.10.2
4. m4.10xlarge: Open MPI 1.10.2
5. c4.8xlarge: Open MPI 1.10.2
6. m4.16xlarge: Open MPI 1.10.2
7. r4.16xlarge: Open MPI 1.10.2
8. AMD EPYC 7601 (NUMA Interleave All): Open MPI 2.0.2
9. AMD EPYC 7601: Open MPI 2.0.2
10. 2 x Intel Xeon Gold 6138: Open MPI 2.0.2
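
The "SE +/-" figures give a standard error over N runs, which helps judge whether two close results are meaningfully different. As a rough sketch (not something the export computes), the difference between two means can be compared against their combined standard error. The function name `differs_beyond_noise` and the threshold of 2 are assumptions for illustration, and the pairing of SE values with systems assumes the export lists them in the same order as the system labels.

```python
import math


def differs_beyond_noise(mean_a, se_a, mean_b, se_b, threshold=2.0):
    """True when the gap between two means exceeds `threshold` combined SEs."""
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    if combined_se == 0:
        return mean_a != mean_b
    return abs(mean_a - mean_b) / combined_se > threshold


# EP.C, Total Mop/s: m4.16xlarge vs. r4.16xlarge
close_pair = differs_beyond_noise(1036.50, 4.41, 1045.49, 1.67)

# EP.C, Total Mop/s: AMD EPYC 7601 vs. 2 x Intel Xeon Gold 6138
wide_pair = differs_beyond_noise(1610.17, 0.32, 1815.89, 34.76)

print(close_pair, wide_pair)
```

With only three to six runs per result this is a coarse screen rather than a formal significance test.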

C-Ray 1.1, Total Time
Seconds, Fewer Is Better
  c4.4xlarge: 13.22 (SE +/- 0.01, N = 3)
  c3.8xlarge: 8.54 (SE +/- 0.01, N = 3)
  m4.10xlarge: 6.45 (SE +/- 0.01, N = 3)
  c4.8xlarge: 6.08 (SE +/- 0.01, N = 3)
  m4.16xlarge: 4.21 (SE +/- 0.04, N = 3)
  r4.16xlarge: 4.16 (SE +/- 0.03, N = 3)
  AMD EPYC 7601 (NUMA Interleave All): 2.85 (SE +/- 0.03, N = 3)
  2 x Intel Xeon Gold 6138: 2.84 (SE +/- 0.01, N = 3)
  AMD EPYC 7601: 2.84 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -lm -lpthread -O3

OpenSSL 1.0.1g, RSA 4096-bit Performance
Signs Per Second, More Is Better
  c4.4xlarge: 1070.07 (SE +/- 0.20, N = 3)
  c3.8xlarge: 1836.53 (SE +/- 4.67, N = 3)
  m4.10xlarge: 2229.10 (SE +/- 1.32, N = 3)
  c4.8xlarge: 2382.30 (SE +/- 4.10, N = 3)
  AMD EPYC 7601: 3294.53 (SE +/- 14.45, N = 3)
  AMD EPYC 7601 (NUMA Interleave All): 3306.37 (SE +/- 13.22, N = 3)
  m4.16xlarge: 3835.70 (SE +/- 3.53, N = 3)
  r4.16xlarge: 3861.60 (SE +/- 3.38, N = 3)
  2 x Intel Xeon Gold 6138: 4826.70 (SE +/- 23.22, N = 3)
1. (CC) gcc options: -m64 -O3 -lssl -lcrypto -ldl

Rodinia 2.4, Test: OpenMP LavaMD
Seconds, Fewer Is Better
  c3.8xlarge: 124.85 (SE +/- 0.10, N = 3)
  c4.4xlarge: 115.91 (SE +/- 0.12, N = 3)
  m4.10xlarge: 56.47 (SE +/- 0.17, N = 3)
  c4.8xlarge: 52.77 (SE +/- 0.15, N = 3)
  r4.16xlarge: 34.86 (SE +/- 0.06, N = 3)
  m4.16xlarge: 34.19 (SE +/- 0.15, N = 3)
  2 x Intel Xeon Gold 6138: 31.44 (SE +/- 0.10, N = 3)
  AMD EPYC 7601: 30.96 (SE +/- 0.01, N = 3)
  AMD EPYC 7601 (NUMA Interleave All): 30.14 (SE +/- 0.15, N = 3)
1. (CXX) g++ options: -O2 -lOpenCL

Primesieve 5.4.2, 1e12 Prime Number Generation
Seconds, Fewer Is Better
  c4.4xlarge: 45.96 (SE +/- 0.05, N = 3)
  c3.8xlarge: 24.76 (SE +/- 0.01, N = 3)
  m4.10xlarge: 22.18 (SE +/- 0.05, N = 3)
  c4.8xlarge: 20.75 (SE +/- 0.03, N = 3)
  r4.16xlarge: 14.63 (SE +/- 0.04, N = 3)
  m4.16xlarge: 14.39 (SE +/- 0.01, N = 3)
  AMD EPYC 7601: 14.08 (SE +/- 0.05, N = 3)
  AMD EPYC 7601 (NUMA Interleave All): 14.06 (SE +/- 0.04, N = 3)
  2 x Intel Xeon Gold 6138: 11.81 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -O2 -fopenmp

John The Ripper 1.8.0, Test: Blowfish
Real C/S, More Is Better
  c4.4xlarge: 11382
  c3.8xlarge: 19532
  m4.10xlarge: 23520
  c4.8xlarge: 25152
  AMD EPYC 7601: 29553
  2 x Intel Xeon Gold 6138: 30373
  AMD EPYC 7601 (NUMA Interleave All): 34335
  r4.16xlarge: 38527
  m4.16xlarge: 39039
SE values as exported (eight entries for nine systems, so they are not matched to systems here): SE +/- 13.00, N = 3; SE +/- 73.10, N = 3; SE +/- 181.43, N = 3; SE +/- 2253.07, N = 6; SE +/- 2076.33, N = 6; SE +/- 764.17, N = 6; SE +/- 51.33, N = 3; SE +/- 25.17, N = 3
1. (CC) gcc options: -fopenmp -lcrypt

NAS Parallel Benchmarks 3.3, Test / Class: LU.A
Total Mop/s, More Is Better
  c4.4xlarge: 20546.79 (SE +/- 22.31, N = 3)
  c3.8xlarge: 34478.44 (SE +/- 104.91, N = 3)
  m4.10xlarge: 37949.85 (SE +/- 902.21, N = 6)
  2 x Intel Xeon Gold 6138: 53974.49 (SE +/- 3783.72, N = 6)
  AMD EPYC 7601 (NUMA Interleave All): 62479.08 (SE +/- 437.89, N = 3)
  AMD EPYC 7601: 63042.80 (SE +/- 1051.01, N = 4)
  m4.16xlarge: 66710.16 (SE +/- 637.44, N = 3)
  r4.16xlarge: 67748.67 (SE +/- 1660.00, N = 6)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. c4.4xlarge: Open MPI 1.10.2
3. c3.8xlarge: Open MPI 1.10.2
4. m4.10xlarge: Open MPI 1.10.2
5. 2 x Intel Xeon Gold 6138: Open MPI 2.0.2
6. AMD EPYC 7601 (NUMA Interleave All): Open MPI 2.0.2
7. AMD EPYC 7601: Open MPI 2.0.2
8. m4.16xlarge: Open MPI 1.10.2
9. r4.16xlarge: Open MPI 1.10.2

Timed LLVM Compilation 4.0.1, Time To Compile
Seconds, Fewer Is Better
  c4.4xlarge: 435.36 (SE +/- 1.06, N = 3)
  c3.8xlarge: 290.03 (SE +/- 0.11, N = 3)
  m4.10xlarge: 213.94 (SE +/- 1.94, N = 3)
  c4.8xlarge: 209.76 (SE +/- 1.88, N = 3)
  AMD EPYC 7601 (NUMA Interleave All): 192.26 (SE +/- 2.34, N = 3)
  AMD EPYC 7601: 175.11 (SE +/- 3.02, N = 4)
  r4.16xlarge: 147.57 (SE +/- 1.87, N = 3)
  m4.16xlarge: 147.24 (SE +/- 1.14, N = 3)
  2 x Intel Xeon Gold 6138: 134.64 (SE +/- 0.81, N = 3)

NAS Parallel Benchmarks 3.3, Test / Class: LU.C
Total Mop/s, More Is Better
  c4.4xlarge: 18060.63 (SE +/- 20.94, N = 3)
  c3.8xlarge: 33661.53 (SE +/- 598.90, N = 3)
  m4.10xlarge: 36509.86 (SE +/- 143.51, N = 3)
  m4.16xlarge: 43175.90 (SE +/- 341.99, N = 3)
  r4.16xlarge: 45966.13 (SE +/- 398.25, N = 3)
  AMD EPYC 7601: 46983.34 (SE +/- 747.98, N = 3)
  AMD EPYC 7601 (NUMA Interleave All): 50046.90 (SE +/- 25.06, N = 3)
  2 x Intel Xeon Gold 6138: 50072.25 (SE +/- 637.78, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. c4.4xlarge: Open MPI 1.10.2
3. c3.8xlarge: Open MPI 1.10.2
4. m4.10xlarge: Open MPI 1.10.2
5. m4.16xlarge: Open MPI 1.10.2
6. r4.16xlarge: Open MPI 1.10.2
7. AMD EPYC 7601: Open MPI 2.0.2
8. AMD EPYC 7601 (NUMA Interleave All): Open MPI 2.0.2
9. 2 x Intel Xeon Gold 6138: Open MPI 2.0.2

Timed Linux Kernel Compilation 4.9, Time To Compile
Seconds, Fewer Is Better
  c4.4xlarge: 73.54 (SE +/- 0.82, N = 3)
  c3.8xlarge: 51.38 (SE +/- 0.78, N = 3)
  m4.10xlarge: 41.84 (SE +/- 0.63, N = 5)
  c4.8xlarge: 39.89 (SE +/- 0.67, N = 4)
  AMD EPYC 7601 (NUMA Interleave All): 39.28 (SE +/- 0.58, N = 5)
  AMD EPYC 7601: 37.40 (SE +/- 0.51, N = 6)
  r4.16xlarge: 30.80 (SE +/- 0.51, N = 6)
  m4.16xlarge: 30.62 (SE +/- 0.52, N = 6)
  2 x Intel Xeon Gold 6138: 30.54 (SE +/- 0.89, N = 6)

Parboil 2.5, Test: OpenMP Stencil
Seconds, Fewer Is Better
  AMD EPYC 7601: 13.82 (SE +/- 0.68, N = 6)
  c4.4xlarge: 12.86 (SE +/- 0.11, N = 3)
  c4.8xlarge: 11.86 (SE +/- 0.31, N = 6)
  m4.10xlarge: 11.47 (SE +/- 0.08, N = 3)
  c3.8xlarge: 11.20 (SE +/- 0.27, N = 6)
  2 x Intel Xeon Gold 6138: 7.75 (SE +/- 0.17, N = 6)
  AMD EPYC 7601 (NUMA Interleave All): 7.75 (SE +/- 0.00, N = 3)
  r4.16xlarge: 7.66 (SE +/- 0.09, N = 3)
  m4.16xlarge: 7.63 (SE +/- 0.15, N = 3)
1. (CXX) g++ options: -lm -lpthread -lgomp -ffast-math -fopenmp

x264 2017-09-08, H.264 Video Encoding
Frames Per Second, More Is Better
  c4.4xlarge: 257.83 (SE +/- 0.88, N = 3)
  m4.10xlarge: 264.84 (SE +/- 5.12, N = 3)
  c4.8xlarge: 281.33 (SE +/- 5.81, N = 6)
  AMD EPYC 7601 (NUMA Interleave All): 288.28 (SE +/- 0.45, N = 3)
  AMD EPYC 7601: 292.63 (SE +/- 1.27, N = 3)
  c3.8xlarge: 303.73 (SE +/- 3.63, N = 3)
  m4.16xlarge: 304.36 (SE +/- 3.42, N = 3)
  2 x Intel Xeon Gold 6138: 310.79 (SE +/- 3.63, N = 3)
  r4.16xlarge: 317.57 (SE +/- 5.32, N = 4)
-lavformat -lavcodec -lavutil -lswscale (listed twice in the export, without system labels)
1. (CC) gcc options: -ldl -m64 -lm -lpthread -O3 -ffast-math -std=gnu99 -fPIC -fomit-frame-pointer -fno-tree-vectorize

Rodinia 2.4, Test: OpenMP Streamcluster
Seconds, Fewer Is Better
  c4.4xlarge: 25.67 (SE +/- 0.53, N = 6)
  m4.10xlarge: 23.66 (SE +/- 0.36, N = 3)
  c4.8xlarge: 23.45 (SE +/- 1.06, N = 6)
  AMD EPYC 7601: 23.15 (SE +/- 1.43, N = 6)
  2 x Intel Xeon Gold 6138: 22.52 (SE +/- 0.42, N = 3)
  m4.16xlarge: 20.21 (SE +/- 1.20, N = 6)
  r4.16xlarge: 16.23 (SE +/- 0.67, N = 6)
  c3.8xlarge: 15.78 (SE +/- 0.84, N = 6)
  AMD EPYC 7601 (NUMA Interleave All): 14.66 (SE +/- 0.27, N = 6)
1. (CXX) g++ options: -O2 -lOpenCL

Parboil 2.5, Test: OpenMP LBM
Seconds, Fewer Is Better
  c4.4xlarge: 114.31 (SE +/- 0.37, N = 3)
  c3.8xlarge: 75.51 (SE +/- 1.54, N = 6)
  m4.10xlarge: 74.43 (SE +/- 3.88, N = 6)
  c4.8xlarge: 65.38 (SE +/- 1.81, N = 6)
  r4.16xlarge: 51.78 (SE +/- 3.00, N = 6)
  AMD EPYC 7601: 50.95 (SE +/- 0.66, N = 3)
  2 x Intel Xeon Gold 6138: 50.04 (SE +/- 1.14, N = 6)
  m4.16xlarge: 47.45 (SE +/- 3.52, N = 6)
  AMD EPYC 7601 (NUMA Interleave All): 38.37 (SE +/- 0.26, N = 3)
1. (CXX) g++ options: -lm -lpthread -lgomp -ffast-math -fopenmp

Phoronix Test Suite v10.8.5