bigsur-lancium-gpu-test

2 x Intel Xeon E5-2680 v4 testing with a Quanta Cloud S2VM-MB (S2VM3A05 BIOS) and ASPEED 11GB on Ubuntu 18.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2006018-WZLA-BIGSURL38.

System configuration (systems tested: Big Sur with 8x K80, Big Sur with 8x K80 1)

Processor: 2 x Intel Xeon E5-2680 v4 @ 3.30GHz (28 Cores / 56 Threads)
Motherboard: Quanta Cloud S2VM-MB (S2VM3A05 BIOS)
Chipset: Intel Xeon E7 v4/Xeon
Memory: 252GB
Disk: 500GB CT500MX500SSD1
Graphics: ASPEED 11GB
Network: 2 x Mellanox MT27710
OS: Ubuntu 18.04
Kernel: 4.15.0-101-generic (x86_64)
Display Server: X Server
OpenCL: OpenCL 1.2 CUDA 10.2.131
Compiler: GCC 7.5.0 + CUDA 9.1 (Big Sur with 8x K80); GCC 6.5.0 20181026 + CUDA 9.1 (Big Sur with 8x K80 1)
File-System: ext4
Screen Resolution: 1024x768

Compiler Details
- Big Sur with 8x K80: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v
- Big Sur with 8x K80 1: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-as=/usr/bin/x86_64-linux-gnu-as --with-default-libstdcxx-abi=new --with-ld=/usr/bin/x86_64-linux-gnu-ld --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic -v

Processor Details
- Scaling Governor: intel_pstate powersave
- CPU Microcode: 0xb000038

Security Details
- itlb_multihit: KVM: Mitigation of Split huge pages
- l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable
- mds: Mitigation of Clear buffers; SMT vulnerable
- meltdown: Mitigation of PTI
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling
- tsx_async_abort: Mitigation of Clear buffers; SMT vulnerable

Results overview

Test                                          Unit     Big Sur with 8x K80  Big Sur with 8x K80 1
shoc: OpenCL - Triad                          GB/s     7.4435               7.3530
shoc: OpenCL - FFT SP                         GFLOPS   258.271              255.413
shoc: OpenCL - MD5 Hash                       GHash/s  3.1416               3.1419
shoc: OpenCL - Max SP Flops                   GFLOPS   3687.54              3689.24
shoc: OpenCL - Bus Speed Download             GB/s     11.9604              11.9593
shoc: OpenCL - Bus Speed Readback             GB/s     12.4036              12.3627
shoc: OpenCL - Texture Read Bandwidth         GB/s     244.225              244.564
cl-mem: Copy                                  GB/s     131.2                131.2
cl-mem: Read                                  GB/s     145.2                145.2
cl-mem: Write                                 GB/s     142.9                143.1
rodinia: OpenCL LavaMD                        Seconds  23.564               23.448
rodinia: OpenCL Myocyte                       Seconds  88.717               90.183
rodinia: OpenCL Heartwall                     Seconds  23.903               24.239
rodinia: OpenCL Particle Filter               Seconds  32.790               31.917
namd-cuda: ATPase Simulation - 327,506 Atoms  days/ns  0.12084              0.11977
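
One way to read the two result columns is as a signed percent change of the second run relative to the first. The sketch below is only an illustration: the function names `percent_change` and `second_run_is_better` are hypothetical helpers, not part of the Phoronix Test Suite, and the values are a subset copied from the results in this report. Small differences may fall within the reported standard errors and should not be over-interpreted.

```python
# Subset of results copied from this report:
# name -> (Big Sur with 8x K80, Big Sur with 8x K80 1, higher_is_better)
results = {
    "shoc: OpenCL - Triad (GB/s)": (7.4435, 7.3530, True),
    "rodinia: OpenCL Myocyte (Seconds)": (88.717, 90.183, False),
    "rodinia: OpenCL Particle Filter (Seconds)": (32.790, 31.917, False),
    "namd-cuda: ATPase Simulation (days/ns)": (0.12084, 0.11977, False),
}


def percent_change(first, second):
    """Signed change of the second run relative to the first, in percent."""
    return (second - first) / first * 100.0


def second_run_is_better(first, second, higher_is_better):
    """Apply the 'More Is Better' / 'Fewer Is Better' direction of a test."""
    return second > first if higher_is_better else second < first


for name, (first, second, higher) in results.items():
    change = percent_change(first, second)
    verdict = "better" if second_run_is_better(first, second, higher) else "not better"
    print(f"{name}: {change:+.2f}% ({verdict})")
```

The direction flag matters because a negative change is a regression for bandwidth tests but an improvement for the timing tests.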

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Triad

GB/s, More Is Better (SHOC Scalable HeterOgeneous Computing 2015-11-10)
Big Sur with 8x K80: 7.4435 (SE +/- 0.0032, N = 3)
Big Sur with 8x K80 1: 7.3530 (SE +/- 0.0184, N = 3)
(CXX) g++ options: -O2 -lSHOCCommon -std=c++14 -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

GFLOPS, More Is Better (SHOC Scalable HeterOgeneous Computing 2015-11-10)
Big Sur with 8x K80: 258.27 (SE +/- 3.25, N = 5)
Big Sur with 8x K80 1: 255.41 (SE +/- 2.80, N = 15)
(CXX) g++ options: -O2 -lSHOCCommon -std=c++14 -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

GHash/s, More Is Better (SHOC Scalable HeterOgeneous Computing 2015-11-10)
Big Sur with 8x K80: 3.1416 (SE +/- 0.0002, N = 3)
Big Sur with 8x K80 1: 3.1419 (SE +/- 0.0004, N = 3)
(CXX) g++ options: -O2 -lSHOCCommon -std=c++14 -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Max SP Flops

GFLOPS, More Is Better (SHOC Scalable HeterOgeneous Computing 2015-11-10)
Big Sur with 8x K80: 3687.54 (SE +/- 1.38, N = 3)
Big Sur with 8x K80 1: 3689.24 (SE +/- 1.39, N = 3)
(CXX) g++ options: -O2 -lSHOCCommon -std=c++14 -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Download

GB/s, More Is Better (SHOC Scalable HeterOgeneous Computing 2015-11-10)
Big Sur with 8x K80: 11.96 (SE +/- 0.06, N = 3)
Big Sur with 8x K80 1: 11.96 (SE +/- 0.06, N = 3)
(CXX) g++ options: -O2 -lSHOCCommon -std=c++14 -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Readback

GB/s, More Is Better (SHOC Scalable HeterOgeneous Computing 2015-11-10)
Big Sur with 8x K80: 12.40 (SE +/- 0.00, N = 3)
Big Sur with 8x K80 1: 12.36 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O2 -lSHOCCommon -std=c++14 -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

GB/s, More Is Better (SHOC Scalable HeterOgeneous Computing 2015-11-10)
Big Sur with 8x K80: 244.23 (SE +/- 0.75, N = 3)
Big Sur with 8x K80 1: 244.56 (SE +/- 0.74, N = 3)
(CXX) g++ options: -O2 -lSHOCCommon -std=c++14 -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

cl-mem

Benchmark: Copy

GB/s, More Is Better (cl-mem 2017-01-13)
Big Sur with 8x K80: 131.2 (SE +/- 0.00, N = 3)
Big Sur with 8x K80 1: 131.2 (SE +/- 0.00, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Read

GB/s, More Is Better (cl-mem 2017-01-13)
Big Sur with 8x K80: 145.2 (SE +/- 0.00, N = 3)
Big Sur with 8x K80 1: 145.2 (SE +/- 0.00, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Write

GB/s, More Is Better (cl-mem 2017-01-13)
Big Sur with 8x K80: 142.9 (SE +/- 0.28, N = 3)
Big Sur with 8x K80 1: 143.1 (SE +/- 0.23, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

Rodinia

Test: OpenCL LavaMD

Seconds, Fewer Is Better (Rodinia 2.4)
Big Sur with 8x K80: 23.56 (SE +/- 0.28, N = 6)
Big Sur with 8x K80 1: 23.45 (SE +/- 0.24, N = 8)
(CXX) g++ options: -O2 -lOpenCL

Rodinia

Test: OpenCL Myocyte

Seconds, Fewer Is Better (Rodinia 2.4)
Big Sur with 8x K80: 88.72 (SE +/- 0.24, N = 3)
Big Sur with 8x K80 1: 90.18 (SE +/- 1.11, N = 3)
(CXX) g++ options: -O2 -lOpenCL

Rodinia

Test: OpenCL Heartwall

Seconds, Fewer Is Better (Rodinia 2.4)
Big Sur with 8x K80: 23.90 (SE +/- 0.19, N = 15)
Big Sur with 8x K80 1: 24.24 (SE +/- 0.29, N = 5)
(CXX) g++ options: -O2 -lOpenCL

Rodinia

Test: OpenCL Particle Filter

Seconds, Fewer Is Better (Rodinia 2.4)
Big Sur with 8x K80: 32.79 (SE +/- 0.15, N = 3)
Big Sur with 8x K80 1: 31.92 (SE +/- 0.11, N = 3)
(CXX) g++ options: -O2 -lOpenCL

NAMD CUDA

ATPase Simulation - 327,506 Atoms

days/ns, Fewer Is Better (NAMD CUDA 2.13)
Big Sur with 8x K80: 0.12084 (SE +/- 0.00065, N = 3)
Big Sur with 8x K80 1: 0.11977 (SE +/- 0.00050, N = 3)
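
NAMD results are reported here in days/ns (wall-clock days per simulated nanosecond), while simulation throughput is often quoted as the reciprocal, ns/day. As a minimal sketch of that conversion, assuming only the two days/ns values listed for this test (`ns_per_day` is a hypothetical helper name, not a NAMD or Phoronix Test Suite function):

```python
def ns_per_day(days_per_ns):
    """Convert a days/ns figure into simulated nanoseconds per day."""
    if days_per_ns <= 0:
        raise ValueError("days/ns must be positive")
    return 1.0 / days_per_ns


# Values copied from the NAMD CUDA result in this report.
runs = {
    "Big Sur with 8x K80": 0.12084,
    "Big Sur with 8x K80 1": 0.11977,
}

for name, value in runs.items():
    print(f"{name}: {ns_per_day(value):.2f} ns/day")
```

Because the conversion is a reciprocal, the "Fewer Is Better" ordering in days/ns becomes "More Is Better" in ns/day.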


Phoronix Test Suite v10.8.4