Amazon EC2 benchmarking for a future article.
Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2112046-TJ-2108199TJ92

Intel M6i Ice Lake vs. Graviton2 Amazon EC2 Benchmarks - Phoronix Test Suite
HTML result view exported from: https://openbenchmarking.org/result/2112046-TJ-2108199TJ92&gru&rdt .

Tested systems (each system lists only the fields the source gives for it)

m6g.metal
  Processor:     ARMv8 Neoverse-N1 (64 Cores)
  Motherboard:   Amazon EC2 m6g.metal v1.0
  Memory:        252GB
  Disk:          107GB Amazon Elastic Block Store
  Network:       Amazon Elastic
  OS:            Ubuntu 20.04
  Kernel:        5.4.0-1045-aws (aarch64)
  Vulkan:        1.0.2
  Compiler:      GCC 9.3.0
  File-System:   ext4

m5.24xlarge
  Processor:     2 x Intel Xeon Platinum 8259CL (48 Cores / 96 Threads)
  Motherboard:   Amazon EC2 m5.24xlarge (1.0 BIOS)
  Chipset:       Intel 440FX 82441FX PMC
  Memory:        374GB
  Kernel:        5.4.0-1045-aws (x86_64)
  System Layer:  KVM

m6i.24xlarge
  Processor:     2 x Intel Xeon Platinum 8375C (48 Cores / 96 Threads)
  Motherboard:   Amazon EC2 m6i.24xlarge (1.0 BIOS)
  Memory:        372GB

m6i.32xlarge
  Processor:     2 x Intel Xeon Platinum 8375C (64 Cores / 128 Threads)
  Motherboard:   Amazon EC2 m6i.32xlarge (1.0 BIOS)
  Memory:        496GB

m6a.24xlarge
  Processor:     AMD EPYC 7R13 (48 Cores / 96 Threads)
  Motherboard:   Amazon EC2 m6a.24xlarge (1.0 BIOS)
  Memory:        370GB
  Kernel:        5.11.0-1020-aws (x86_64)

Kernel Details
  - Transparent Huge Pages: madvise

Compiler Details
  - m6g.metal: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
  - m5.24xlarge, m6i.24xlarge, m6i.32xlarge, m6a.24xlarge: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Java Details
  - m6g.metal, m5.24xlarge: OpenJDK Runtime Environment (build 11.0.11+9-Ubuntu-0ubuntu2.20.04)

Security Details
  - m6g.metal: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected
  - m5.24xlarge: itlb_multihit: KVM: Vulnerable + l1tf: Mitigation of PTE Inversion + mds: Vulnerable: Clear buffers attempted no microcode; SMT Host state unknown + meltdown: Mitigation of PTI + spec_store_bypass: Vulnerable + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full generic retpoline STIBP: disabled RSB filling + srbds: Not affected + tsx_async_abort: Not affected
  - m6i.24xlarge, m6i.32xlarge: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
  - m6a.24xlarge: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected

Processor Details
  - m5.24xlarge: CPU Microcode: 0x5003005
  - m6i.24xlarge: CPU Microcode: 0xd0002b1
  - m6i.32xlarge: CPU Microcode: 0xd0002b1
  - m6a.24xlarge: CPU Microcode: 0xa001143

Result summary

Test                                                 m6g.metal        m5.24xlarge      m6i.24xlarge     m6i.32xlarge     m6a.24xlarge
minife: Small                                        23848.2          14007.1          19946.4          18797.6          11197.8
hpcg:                                                21.4570          26.8884          37.2245          39.1328          17.0281
coremark: CoreMark Size 666 - Iterations Per Second  1236555.803752   1451630.519049   1607068.543334   2128843.094210   1850380.040001
stockfish: Total Time                                96657449         105658561        136790816        169762583        127232305
asmfish: 1024 Hash Memory, 26 Depth                  104868482        115160185        136656900        169329043        130814939
rocksdb: Rand Read                                   270332614        194576074        231109408        298073130        269589717
npb: BT.C                                            24464.82         104533.15        136431.11        202455.31        92962.40
npb: CG.C                                            13438.71         30206.03         33146.76         38736.50         23714.82
npb: EP.C                                            2218.08          4777.13          6426.32          8107.02          2759.76
npb: EP.D                                            2233.14          4875.47          6752.38          8765.66          2780.04
npb: FT.C                                            21850.28         50800.74         70031.71         102661.18        54008.74
npb: MG.C                                            25872.77         65732.22         88248.73         117771.92        49638.45
lulesh:                                              16867.370        16272.587        22519.115        35739.821        16745.120
pennant: sedovbig                                    15.41301         25.22237         17.24513         15.14226         12.27951
pennant: leblancbig                                  11.29726         10.03026         6.413928         5.105541         7.508991
tnn: CPU - DenseNet                                  3288.829         3797.589         3522.581         3524.690         2866.455
tnn: CPU - MobileNet v2                              365.839          426.094          350.378          349.167          312.780
tnn: CPU - SqueezeNet v2                             105.072          93.266           70.932           70.518           78.819
tnn: CPU - SqueezeNet v1.1                           341.315          394.508          357.721          357.689          282.809
incompact3d: input.i3d 129 Cells Per Direction       5.18380547       4.76513163       3.49298970       2.95411030       5.42248488
incompact3d: input.i3d 193 Cells Per Direction       23.2480348       21.4682878       14.9058580       12.5943028       24.9592756
povray: Trace Time                                   57.439           42.964           10.631           9.015            10.726
m-queens: Time To Solve                              19.430           22.341           16.068           12.334           12.334
n-queens: Elapsed Time                               3.761            3.892            3.144            2.312            2.382

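The summary values above mix "More Is Better" and "Fewer Is Better" metrics, so a raw ratio between two instances points in different directions depending on the test. The following is a minimal Python sketch, not part of the original result file, of one way to normalize a few of the rows into speedups of m6i.24xlarge over m5.24xlarge. The names `SAMPLES`, `speedup`, and `summarize` are hypothetical, the selection of four tests is arbitrary, and the geometric mean is only one possible way to aggregate.

```python
from statistics import geometric_mean

# (m5.24xlarge value, m6i.24xlarge value, higher_is_better)
# Values are copied from the summary above; the direction flag follows each
# test's "More Is Better" / "Fewer Is Better" label in the per-test results.
SAMPLES = {
    "hpcg (GFLOP/s)": (26.8884, 37.2245, True),
    "npb BT.C (Mop/s)": (104533.15, 136431.11, True),
    "pennant leblancbig (s)": (10.03026, 6.413928, False),
    "povray trace time (s)": (42.964, 10.631, False),
}


def speedup(baseline, candidate, higher_is_better):
    """Return how many times faster the candidate is than the baseline."""
    if higher_is_better:
        return candidate / baseline
    # For time-like metrics a smaller value is faster, so invert the ratio.
    return baseline / candidate


def summarize(samples):
    """Return per-test speedups and their geometric mean."""
    ratios = {
        name: speedup(base, cand, hib)
        for name, (base, cand, hib) in samples.items()
    }
    return ratios, geometric_mean(list(ratios.values()))


if __name__ == "__main__":
    ratios, overall = summarize(SAMPLES)
    for name, ratio in ratios.items():
        print(f"{name:26s} {ratio:.2f}x")
    print(f"{'geometric mean':26s} {overall:.2f}x")
```

A geometric mean is used here because it treats a 2x gain on one test and a 2x loss on another as cancelling out; an arithmetic mean of ratios would not.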
miniFE 2.2 - Problem Size: Small
CG Mflops, More Is Better
  m6g.metal      23848.2   SE +/- 5.77, N = 3
  m5.24xlarge    14007.1   SE +/- 284.82, N = 15
  m6i.24xlarge   19946.4   SE +/- 817.97, N = 15
  m6i.32xlarge   18797.6   SE +/- 590.35, N = 15
  m6a.24xlarge   11197.8   SE +/- 149.74, N = 15
1. (CXX) g++ options: -O3 -fopenmp -pthread -lmpi_cxx -lmpi

High Performance Conjugate Gradient 3.1
GFLOP/s, More Is Better
  m6g.metal      21.46   SE +/- 0.01, N = 3
  m5.24xlarge    26.89   SE +/- 0.17, N = 3
  m6i.24xlarge   37.22   SE +/- 0.01, N = 3
  m6i.32xlarge   39.13   SE +/- 0.13, N = 3
  m6a.24xlarge   17.03   SE +/- 0.01, N = 3
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -pthread -lmpi_cxx -lmpi

Coremark 1.0 - CoreMark Size 666 - Iterations Per Second
Iterations/Sec, More Is Better
  m6g.metal      1236555.80   SE +/- 279.74, N = 3
  m5.24xlarge    1451630.52   SE +/- 5103.90, N = 3
  m6i.24xlarge   1607068.54   SE +/- 8144.03, N = 3
  m6i.32xlarge   2128843.09   SE +/- 2056.46, N = 3
  m6a.24xlarge   1850380.04   SE +/- 16150.98, N = 3
1. (CC) gcc options: -O2 -lrt" -lrt

Stockfish 13 - Total Time
Nodes Per Second, More Is Better
  m6g.metal      96657449    SE +/- 692846.14, N = 3
  m5.24xlarge    105658561   SE +/- 1176206.14, N = 15
  m6i.24xlarge   136790816   SE +/- 185784.28, N = 3
  m6i.32xlarge   169762583   SE +/- 554216.71, N = 3
  m6a.24xlarge   127232305   SE +/- 1261758.38, N = 6
Additional per-system compiler flags, as listed in the source (four sets for five systems; the flattened source does not preserve which system each set belongs to):
  -m64 -msse -msse3 -mpopcnt -mavx2 -mavx512f -mavx512bw -msse4.1 -mssse3 -msse2 -mbmi2
  -m64 -msse -msse3 -mpopcnt -mavx2 -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2
  -m64 -msse -msse3 -mpopcnt -mavx2 -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2
  -m64 -msse -msse3 -mpopcnt -mavx2 -msse4.1 -mssse3 -msse2
1. (CXX) g++ options: -lgcov -lpthread -fno-exceptions -std=c++17 -fprofile-use -fno-peel-loops -fno-tracer -pedantic -O3 -flto -flto=jobserver

asmFish 2018-07-23 - 1024 Hash Memory, 26 Depth
Nodes/second, More Is Better
  m6g.metal      104868482   SE +/- 1056350.97, N = 3
  m5.24xlarge    115160185   SE +/- 806502.14, N = 12
  m6i.24xlarge   136656900   SE +/- 1425879.59, N = 3
  m6i.32xlarge   169329043   SE +/- 870392.07, N = 3
  m6a.24xlarge   130814939   SE +/- 280698.15, N = 3

Facebook RocksDB 6.22.1 - Test: Random Read
Op/s, More Is Better
  m6g.metal      270332614   SE +/- 1242136.86, N = 3
  m5.24xlarge    194576074   SE +/- 128495.22, N = 3
  m6i.24xlarge   231109408   SE +/- 1027109.62, N = 3
  m6i.32xlarge   298073130   SE +/- 2848830.44, N = 3
  m6a.24xlarge   269589717   SE +/- 749179.54, N = 3
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

NAS Parallel Benchmarks 3.4 - Test / Class: BT.C
Total Mop/s, More Is Better
  m6g.metal      24464.82    SE +/- 12.96, N = 3
  m5.24xlarge    104533.15   SE +/- 119.93, N = 3
  m6i.24xlarge   136431.11   SE +/- 147.64, N = 3
  m6i.32xlarge   202455.31   SE +/- 156.09, N = 3
  m6a.24xlarge   92962.40    SE +/- 140.47, N = 3
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 4.0.3

NAS Parallel Benchmarks 3.4 - Test / Class: CG.C
Total Mop/s, More Is Better
  m6g.metal      13438.71   SE +/- 27.23, N = 3
  m5.24xlarge    30206.03   SE +/- 40.34, N = 3
  m6i.24xlarge   33146.76   SE +/- 54.87, N = 3
  m6i.32xlarge   38736.50   SE +/- 259.41, N = 3
  m6a.24xlarge   23714.82   SE +/- 167.20, N = 3
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 4.0.3

NAS Parallel Benchmarks 3.4 - Test / Class: EP.C
Total Mop/s, More Is Better
  m6g.metal      2218.08   SE +/- 9.83, N = 3
  m5.24xlarge    4777.13   SE +/- 188.76, N = 15
  m6i.24xlarge   6426.32   SE +/- 82.22, N = 3
  m6i.32xlarge   8107.02   SE +/- 60.61, N = 11
  m6a.24xlarge   2759.76   SE +/- 6.49, N = 3
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 4.0.3

NAS Parallel Benchmarks 3.4 - Test / Class: EP.D
Total Mop/s, More Is Better
  m6g.metal      2233.14   SE +/- 1.71, N = 3
  m5.24xlarge    4875.47   SE +/- 257.35, N = 12
  m6i.24xlarge   6752.38   SE +/- 78.27, N = 15
  m6i.32xlarge   8765.66   SE +/- 77.50, N = 15
  m6a.24xlarge   2780.04   SE +/- 4.54, N = 3
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 4.0.3

NAS Parallel Benchmarks 3.4 - Test / Class: FT.C
Total Mop/s, More Is Better
  m6g.metal      21850.28    SE +/- 2.47, N = 3
  m5.24xlarge    50800.74    SE +/- 441.43, N = 15
  m6i.24xlarge   70031.71    SE +/- 54.84, N = 3
  m6i.32xlarge   102661.18   SE +/- 1220.06, N = 4
  m6a.24xlarge   54008.74    SE +/- 84.37, N = 3
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 4.0.3

NAS Parallel Benchmarks 3.4 - Test / Class: MG.C
Total Mop/s, More Is Better
  m6g.metal      25872.77    SE +/- 37.90, N = 3
  m5.24xlarge    65732.22    SE +/- 462.23, N = 3
  m6i.24xlarge   88248.73    SE +/- 6.50, N = 3
  m6i.32xlarge   117771.92   SE +/- 270.09, N = 3
  m6a.24xlarge   49638.45    SE +/- 36.70, N = 3
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 4.0.3

LULESH 2.0.3
z/s, More Is Better
  m6g.metal      16867.37   SE +/- 6.43, N = 3
  m5.24xlarge    16272.59   SE +/- 6.88, N = 3
  m6i.24xlarge   22519.12   SE +/- 64.35, N = 3
  m6i.32xlarge   35739.82   SE +/- 50.85, N = 3
  m6a.24xlarge   16745.12   SE +/- 110.64, N = 3
1. (CXX) g++ options: -O3 -fopenmp -lm -pthread -lmpi_cxx -lmpi

Pennant 1.0.1 - Test: sedovbig
Hydro Cycle Time - Seconds, Fewer Is Better
  m6g.metal      15.41   SE +/- 0.00, N = 3
  m5.24xlarge    25.22   SE +/- 0.03, N = 3
  m6i.24xlarge   17.25   SE +/- 0.01, N = 3
  m6i.32xlarge   15.14   SE +/- 0.01, N = 3
  m6a.24xlarge   12.28   SE +/- 0.04, N = 3
1. (CXX) g++ options: -fopenmp -pthread -lmpi_cxx -lmpi

Pennant 1.0.1 - Test: leblancbig
Hydro Cycle Time - Seconds, Fewer Is Better
  m6g.metal      11.297260   SE +/- 0.003475, N = 3
  m5.24xlarge    10.030260   SE +/- 0.019405, N = 3
  m6i.24xlarge   6.413928    SE +/- 0.016794, N = 3
  m6i.32xlarge   5.105541    SE +/- 0.014585, N = 3
  m6a.24xlarge   7.508991    SE +/- 0.005942, N = 3
1. (CXX) g++ options: -fopenmp -pthread -lmpi_cxx -lmpi

TNN 0.3 - Target: CPU - Model: DenseNet
ms, Fewer Is Better
  m6g.metal      3288.83   SE +/- 4.12, N = 3   MIN: 3237.1 / MAX: 3327.59
  m5.24xlarge    3797.59   SE +/- 2.50, N = 3   MIN: 3754.49 / MAX: 4081.73
  m6i.24xlarge   3522.58   SE +/- 3.59, N = 3   MIN: 3492.16 / MAX: 3621.39
  m6i.32xlarge   3524.69   SE +/- 0.73, N = 3   MIN: 3482.31 / MAX: 3882.56
  m6a.24xlarge   2866.46   SE +/- 1.59, N = 3   MIN: 2825.11 / MAX: 2989.74
1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

TNN 0.3 - Target: CPU - Model: MobileNet v2
ms, Fewer Is Better
  m6g.metal      365.84   SE +/- 0.26, N = 3   MIN: 364.34 / MAX: 367.32
  m5.24xlarge    426.09   SE +/- 0.52, N = 3   MIN: 422.84 / MAX: 477.54
  m6i.24xlarge   350.38   SE +/- 0.24, N = 3   MIN: 348.46 / MAX: 395.91
  m6i.32xlarge   349.17   SE +/- 0.19, N = 3   MIN: 346.78 / MAX: 378.54
  m6a.24xlarge   312.78   SE +/- 1.81, N = 3   MIN: 309.33 / MAX: 382.03
1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

TNN 0.3 - Target: CPU - Model: SqueezeNet v2
ms, Fewer Is Better
  m6g.metal      105.07   SE +/- 0.14, N = 3   MIN: 104.55 / MAX: 105.78
  m5.24xlarge    93.27    SE +/- 0.01, N = 3   MIN: 93.03 / MAX: 93.7
  m6i.24xlarge   70.93    SE +/- 0.53, N = 3   MIN: 70.08 / MAX: 72.98
  m6i.32xlarge   70.52    SE +/- 0.10, N = 3   MIN: 70.09 / MAX: 71.46
  m6a.24xlarge   78.82    SE +/- 0.09, N = 3   MIN: 78.48 / MAX: 84.22
1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

TNN 0.3 - Target: CPU - Model: SqueezeNet v1.1
ms, Fewer Is Better
  m6g.metal      341.32   SE +/- 0.83, N = 3   MIN: 338.67 / MAX: 344.11
  m5.24xlarge    394.51   SE +/- 0.14, N = 3   MIN: 393.65 / MAX: 397.65
  m6i.24xlarge   357.72   SE +/- 0.23, N = 3   MIN: 356.98 / MAX: 361.71
  m6i.32xlarge   357.69   SE +/- 0.00, N = 3   MIN: 357.05 / MAX: 360.09
  m6a.24xlarge   282.81   SE +/- 0.32, N = 3   MIN: 281.46 / MAX: 315.41
1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 129 Cells Per Direction
Seconds, Fewer Is Better
  m6g.metal      5.18380547   SE +/- 0.00359720, N = 3
  m5.24xlarge    4.76513163   SE +/- 0.01271715, N = 3
  m6i.24xlarge   3.49298970   SE +/- 0.02606307, N = 3
  m6i.32xlarge   2.95411030   SE +/- 0.01051693, N = 3
  m6a.24xlarge   5.42248488   SE +/- 0.01191253, N = 3
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi

Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 193 Cells Per Direction
Seconds, Fewer Is Better
  m6g.metal      23.25   SE +/- 0.02, N = 3
  m5.24xlarge    21.47   SE +/- 0.11, N = 3
  m6i.24xlarge   14.91   SE +/- 0.08, N = 3
  m6i.32xlarge   12.59   SE +/- 0.02, N = 3
  m6a.24xlarge   24.96   SE +/- 0.25, N = 3
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi

POV-Ray 3.7.0.7 - Trace Time
Seconds, Fewer Is Better
  m6g.metal      57.439   SE +/- 0.915, N = 15
  m5.24xlarge    42.964   SE +/- 4.195, N = 15
  m6i.24xlarge   10.631   SE +/- 0.059, N = 3
  m6i.32xlarge   9.015    SE +/- 0.105, N = 3
  m6a.24xlarge   10.726   SE +/- 0.016, N = 3
Additional per-system flags, as listed in the source (the boundaries between systems are not recoverable from the flattened text): -lSM -lICE -lX11 -march=native -lSM -lICE -lX11 -march=native -lSM -lICE -lX11 -march=native -march=native
1. (CXX) g++ options: -pipe -O3 -ffast-math -pthread -ltiff -ljpeg -lpng -lz -lrt -lm -lboost_thread -lboost_system

m-queens 1.2 - Time To Solve
Seconds, Fewer Is Better
  m6g.metal      19.43   SE +/- 0.01, N = 3
  m5.24xlarge    22.34   SE +/- 0.13, N = 3
  m6i.24xlarge   16.07   SE +/- 0.19, N = 3
  m6i.32xlarge   12.33   SE +/- 0.14, N = 4
  m6a.24xlarge   12.33   SE +/- 0.02, N = 3
1. (CXX) g++ options: -fopenmp -O2 -march=native

N-Queens 1.0 - Elapsed Time
Seconds, Fewer Is Better
  m6g.metal      3.761   SE +/- 0.001, N = 3
  m5.24xlarge    3.892   SE +/- 0.028, N = 3
  m6i.24xlarge   3.144   SE +/- 0.041, N = 3
  m6i.32xlarge   2.312   SE +/- 0.052, N = 15
  m6a.24xlarge   2.382   SE +/- 0.007, N = 3
1. (CC) gcc options: -static -fopenmp -O3 -march=native

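Each result above carries an "SE +/-" figure and a run count N, and several tests ran more than three times on some instances. As a rough aid for reading those figures, the sketch below (Python, not part of the original result file) computes the relative error for the POV-Ray rows and flags noisy results. It assumes that "SE" denotes the standard error of the mean, so that the implied sample standard deviation is SE * sqrt(N). The 5% threshold and the names `POVRAY`, `relative_se`, `implied_stddev`, and `noisy` are hypothetical choices for illustration.

```python
import math

# (mean in seconds, SE, N) copied from the POV-Ray 3.7.0.7 results above.
POVRAY = {
    "m6g.metal": (57.439, 0.915, 15),
    "m5.24xlarge": (42.964, 4.195, 15),
    "m6i.24xlarge": (10.631, 0.059, 3),
    "m6i.32xlarge": (9.015, 0.105, 3),
    "m6a.24xlarge": (10.726, 0.016, 3),
}


def relative_se(mean, se):
    """Standard error expressed as a fraction of the mean."""
    return se / mean


def implied_stddev(se, n):
    """Sample standard deviation implied by SE = stddev / sqrt(N).

    Assumption: the reported SE is the standard error of the mean.
    """
    return se * math.sqrt(n)


def noisy(results, threshold=0.05):
    """Names of systems whose relative SE exceeds the (hypothetical) threshold."""
    return [
        name
        for name, (mean, se, _n) in results.items()
        if relative_se(mean, se) > threshold
    ]


if __name__ == "__main__":
    for name, (mean, se, n) in POVRAY.items():
        print(f"{name:13s} rel. SE {relative_se(mean, se):6.2%}  "
              f"implied stddev {implied_stddev(se, n):.3f} s")
    print("above threshold:", noisy(POVRAY))
```

Under these assumptions, a result with a large relative SE is better read as a range than as a single number when comparing instances.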
Phoronix Test Suite v10.8.4