a100-clpeak

2 x AMD EPYC 7742 64-Core testing with an NVIDIA DGXA100 v555.06901.0004 (0.34 BIOS) and ASPEED 40GB on Ubuntu 20.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2109277-IB-A100CLPEA16.

a100-clpeak system details: 2 x AMD EPYC 7742 64-Core - ASPEED 40GB - NVIDIA

Processor: 2 x AMD EPYC 7742 64-Core @ 2.25GHz (128 Cores / 256 Threads)
Motherboard: NVIDIA DGXA100 v555.06901.0004 (0.34 BIOS)
Chipset: AMD Starship/Matisse
Memory: 16 x 64 GB DDR4-3200MT/s 36ASF8G72PZ-3G2B2
Disk: 4 x 3841GB SAMSUNG MZWLJ3T8HBLS-00007 + 2 x 1920GB SAMSUNG MZ1LB1T9HALS-00007
Graphics: ASPEED 40GB
Network: 2 x Intel 82599ES 10-Gigabit SFI/SFP+ + 3 x Mellanox MT28908 + Intel I210
OS: Ubuntu 20.04
Kernel: 5.4.0-80-generic (x86_64)
Display Server: X Server
Display Driver: NVIDIA
OpenCL: OpenCL 1.2 CUDA 11.2.109
Compiler: GCC 9.3.0 + CUDA 11.2
File-System: ext4
Screen Resolution: 800x600

- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0x8301034
- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected

a100-clpeak results: 2 x AMD EPYC 7742 64-Core - ASPEED 40GB - NVIDIA

clpeak: Kernel Latency: 11.26 us
clpeak: Integer Compute INT: 16758.84 GIOPS
clpeak: Single-Precision Float: 16078.58 GFLOPS
clpeak: Double-Precision Double: 8912.24 GFLOPS
clpeak: Global Memory Bandwidth: 1287.13 GBPS
clpeak: Transfer Bandwidth enqueueReadBuffer: 9.81 GBPS
clpeak: Transfer Bandwidth enqueueWriteBuffer: 18.50 GBPS

clpeak

OpenCL Test: Kernel Latency

us, Fewer Is Better
2 x AMD EPYC 7742 64-Core - ASPEED 40GB - NVIDIA: 11.26 (SE +/- 0.07, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Integer Compute INT

GIOPS, More Is Better
2 x AMD EPYC 7742 64-Core - ASPEED 40GB - NVIDIA: 16758.84 (SE +/- 161.41, N = 15)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Single-Precision Float

GFLOPS, More Is Better
2 x AMD EPYC 7742 64-Core - ASPEED 40GB - NVIDIA: 16078.58 (SE +/- 218.52, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Double-Precision Double

GFLOPS, More Is Better
2 x AMD EPYC 7742 64-Core - ASPEED 40GB - NVIDIA: 8912.24 (SE +/- 68.13, N = 10)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Global Memory Bandwidth

GBPS, More Is Better
2 x AMD EPYC 7742 64-Core - ASPEED 40GB - NVIDIA: 1287.13 (SE +/- 0.18, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Transfer Bandwidth enqueueReadBuffer

GBPS, More Is Better
2 x AMD EPYC 7742 64-Core - ASPEED 40GB - NVIDIA: 9.81 (SE +/- 0.10, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Transfer Bandwidth enqueueWriteBuffer

GBPS, More Is Better
2 x AMD EPYC 7742 64-Core - ASPEED 40GB - NVIDIA: 18.50 (SE +/- 0.08, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
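
To compare run-to-run variation across tests that use different units, the reported means and SE values can be put on a common scale. The Python sketch below is not part of the exported result. It copies the seven mean / SE / N triples from the results above and assumes that "SE" is the standard error of the mean (sample standard deviation divided by the square root of N), which this export does not state explicitly. The names RESULTS, relative_se_percent and implied_std_dev are illustrative only.

```python
import math

# (test, unit, mean, SE, N) copied from the clpeak results in this report.
RESULTS = [
    ("Kernel Latency", "us", 11.26, 0.07, 3),
    ("Integer Compute INT", "GIOPS", 16758.84, 161.41, 15),
    ("Single-Precision Float", "GFLOPS", 16078.58, 218.52, 3),
    ("Double-Precision Double", "GFLOPS", 8912.24, 68.13, 10),
    ("Global Memory Bandwidth", "GBPS", 1287.13, 0.18, 3),
    ("Transfer Bandwidth enqueueReadBuffer", "GBPS", 9.81, 0.10, 3),
    ("Transfer Bandwidth enqueueWriteBuffer", "GBPS", 18.50, 0.08, 3),
]


def relative_se_percent(mean, se):
    """Standard error expressed as a percentage of the mean."""
    return 100.0 * se / mean


def implied_std_dev(se, n):
    """Sample standard deviation implied by SE, ASSUMING SE = s / sqrt(N)."""
    return se * math.sqrt(n)


if __name__ == "__main__":
    for name, unit, mean, se, n in RESULTS:
        print(
            f"{name:40s} {mean:10.2f} {unit:6s} "
            f"relative SE {relative_se_percent(mean, se):5.2f}%  "
            f"implied std dev {implied_std_dev(se, n):8.2f} (N = {n})"
        )
```

A relative figure makes it easier to see which tests were noisy enough for the suite to run more than three times (N = 15 and N = 10 above) without comparing GIOPS against microseconds directly.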


Phoronix Test Suite v10.8.0