mi100-1

AMD Ryzen 5 3600 6-Core testing with a Gigabyte X570 AORUS PRO (F34 BIOS) and AMD Radeon VII 16GB on ManjaroLinux 21.1.0 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2107156-IB-2105265IB70&sor.

mi100
Processor: 16 x Intel Core (Haswell no TSX) (16 Cores)
Motherboard: RDO OpenStack Compute (1.11.0-2.el7 BIOS)
Chipset: Intel 82G33/G31/P35/P31 + ICH9
Memory: 64GB
Disk: 21GB QEMU HDD + 107GB QEMU HDD
Graphics: Cirrus Logic GD 5446 32GB
Network: Red Hat Virtio device
OS: Ubuntu 18.04
Kernel: 5.4.0-64-generic (x86_64)
OpenCL: OpenCL 2.0 AMD-APP (3275.0)
Compiler: GCC 7.5.0
File-System: ext4
Screen Resolution: 1024x768
System Layer: KVM

V100
Processor: 2 x Intel Xeon (Skylake IBRS) (2 Cores)
Memory: 8GB
Disk: 21GB QEMU HDD + 53GB QEMU HDD
Graphics: Cirrus Logic GD 5446 8GB
OS: Ubuntu 20.04
Kernel: 5.4.0-67-generic (x86_64)
Display Driver: NVIDIA
OpenCL: OpenCL 1.2 CUDA 11.0.228
Vulkan: 1.2.133
Compiler: GCC 9.3.0 + CUDA 11.2

Radeon VII 2x
Processor: AMD Ryzen 5 3600 6-Core @ 3.60GHz (6 Cores / 12 Threads)
Motherboard: Gigabyte X570 AORUS PRO (F34 BIOS)
Chipset: AMD Starship/Matisse
Memory: 32GB
Disk: 1000GB Sabrent Rocket 4.0 1TB + 240GB SanDisk SDSSDA24 + 256GB SanDisk SD8SN8U2 + 0GB Multiple Reader + 16GB SD/MMC/MS PRO + 510PF
Graphics: AMD Radeon VII 16GB (1801/1000MHz)
Audio: AMD Vega 20 HDMI Audio
Network: Intel I211 + Intel Wi-Fi 6 AX200
OS: ManjaroLinux 21.1.0
Kernel: 5.13.1-3-MANJARO (x86_64)
Display Server: X Server 1.20.11
OpenGL: 4.6 Mesa 21.1.4 (LLVM 12.0.0)
OpenCL: OpenCL 2.0 AMD-APP.dbg (3275.0)
Compiler: GCC 11.1.0 + Clang 12.0.1
File-System: f2fs
Screen Resolution: 2560x1440

Fields that the export does not restate for a system are omitted above.

Kernel Details
- mi100: Transparent Huge Pages: madvise
- V100: Transparent Huge Pages: madvise
- Radeon VII 2x: amdgpu.ppfeaturemask=0xffffffff - Transparent Huge Pages: madvise

Compiler Details
- mi100: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v
- V100: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Radeon VII 2x: --disable-libssp --disable-libstdcxx-pch --disable-libunwind-exceptions --disable-werror --enable-__cxa_atexit --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-install-libiberty --enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++,d --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-isl --with-linker-hash-style=gnu

Processor Details
- mi100: CPU Microcode: 0x1
- V100: CPU Microcode: 0x1
- Radeon VII 2x: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0x8701021

Python Details
- mi100: Python 2.7.17 + Python 3.6.9

Security Details
- mi100: itlb_multihit: KVM: Vulnerable + l1tf: Mitigation of PTE Inversion + mds: Vulnerable: Clear buffers attempted no microcode; SMT Host state unknown + meltdown: Mitigation of PTI + spec_store_bypass: Vulnerable + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full generic retpoline STIBP: disabled RSB filling + srbds: Unknown: Dependent on hypervisor status + tsx_async_abort: Not affected
- V100: itlb_multihit: KVM: Vulnerable + l1tf: Mitigation of PTE Inversion + mds: Vulnerable: Clear buffers attempted no microcode; SMT Host state unknown + meltdown: Mitigation of PTI + spec_store_bypass: Vulnerable + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: disabled RSB filling + srbds: Not affected + tsx_async_abort: Vulnerable: Clear buffers attempted no microcode; SMT Host state unknown
- Radeon VII 2x: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected

Graphics Details
- Radeon VII 2x: GLAMOR

Test                                           mi100      V100       Radeon VII 2x
shoc: OpenCL - Triad                           12.2740    12.2649    6.8301
shoc: OpenCL - FFT SP                          2783.51    2278.09    2399.12
shoc: OpenCL - MD5 Hash                        27.8910    31.0928    16.8239
shoc: OpenCL - Max SP Flops                    21943033   14052.7    7313300
shoc: OpenCL - Bus Speed Download              13.6694    12.3441    7.1672
shoc: OpenCL - Bus Speed Readback              14.0831    13.1709    7.1500
shoc: OpenCL - Texture Read Bandwidth          706.109    1470.52    453.429
cl-mem: Copy                                   286.8      268.5      300.8
cl-mem: Read                                   916.8      780.2      808.5
cl-mem: Write                                  730.0      736.7      674.0
rodinia: OpenCL Myocyte                        132.600    115.479    117.978
rodinia: OpenCL Heartwall                      3.133      2.919      -
darktable: Boat - OpenCL (v2.4.2)              2.008      -          -
darktable: Masskrug - OpenCL (v2.4.2)          5.075      -          -
darktable: Server Rack - OpenCL (v2.4.2)       0.177      -          -
darktable: Server Room - OpenCL (v2.4.2)       0.864      -          -
blender: BMW27 - OpenCL                        53.76      1281.46    37.48
clpeak: Kernel Latency                         17.87      5.51       13.86
clpeak: Integer Compute INT                    7487.84    13899.17   4583.14
clpeak: Single-Precision Float                 22813.55   14073.61   13724.22
clpeak: Double-Precision Double                11439.47   7003.99    3441.77
clpeak: Global Memory Bandwidth                960.15     769.52     808.05
clpeak: Transfer Bandwidth enqueueReadBuffer   4.86       4.04       20.29
clpeak: Transfer Bandwidth enqueueWriteBuffer  10.96      6.64       26.38
darktable: Boat - OpenCL (v3.0.1)              -          5.566      -
darktable: Masskrug - OpenCL (v3.0.1)          -          18.156     -
darktable: Server Rack - OpenCL (v3.0.1)       -          0.414      -
darktable: Server Room - OpenCL (v3.0.1)       -          1.810      -

A dash marks a test with no result for that system.

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Triad

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GB/s, More Is Better)
mi100: 12.2740 (SE +/- 0.1549, N = 3)
V100: 12.2649 (SE +/- 0.0037, N = 3)
Radeon VII 2x: 6.8301 (SE +/- 0.0003, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GFLOPS, More Is Better)
mi100: 2783.51 (SE +/- 2.72, N = 3)
Radeon VII 2x: 2399.12 (SE +/- 1.04, N = 3)
V100: 2278.09 (SE +/- 7.23, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GHash/s, More Is Better)
V100: 31.09 (SE +/- 0.01, N = 3)
mi100: 27.89 (SE +/- 0.00, N = 3)
Radeon VII 2x: 16.82 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Max SP Flops

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GFLOPS, More Is Better)
mi100: 21943033.0 (SE +/- 89939.99, N = 3)
Radeon VII 2x: 7313300.0 (SE +/- 153743.02, N = 3)
V100: 14052.7 (SE +/- 6.33, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Download

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GB/s, More Is Better)
mi100: 13.6694 (SE +/- 0.0004, N = 3)
V100: 12.3441 (SE +/- 0.0001, N = 3)
Radeon VII 2x: 7.1672 (SE +/- 0.0026, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Readback

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GB/s, More Is Better)
mi100: 14.0831 (SE +/- 0.0033, N = 3)
V100: 13.1709 (SE +/- 0.0001, N = 3)
Radeon VII 2x: 7.1500 (SE +/- 0.0010, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GB/s, More Is Better)
V100: 1470.52 (SE +/- 1.76, N = 3)
mi100: 706.11 (SE +/- 0.38, N = 3)
Radeon VII 2x: 453.43 (SE +/- 0.91, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

cl-mem

Benchmark: Copy

cl-mem 2017-01-13 (GB/s, More Is Better)
Radeon VII 2x: 300.8 (SE +/- 0.67, N = 3)
mi100: 286.8 (SE +/- 1.71, N = 3)
V100: 268.5 (SE +/- 0.47, N = 3)
1. (CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Read

cl-mem 2017-01-13 (GB/s, More Is Better)
mi100: 916.8 (SE +/- 1.78, N = 3)
Radeon VII 2x: 808.5 (SE +/- 1.91, N = 3)
V100: 780.2 (SE +/- 1.72, N = 3)
1. (CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Write

cl-mem 2017-01-13 (GB/s, More Is Better)
V100: 736.7 (SE +/- 0.59, N = 3)
mi100: 730.0 (SE +/- 0.52, N = 3)
Radeon VII 2x: 674.0 (SE +/- 1.99, N = 3)
1. (CC) gcc options: -O2 -flto -lOpenCL

Rodinia

Test: OpenCL Myocyte

Rodinia 3.1 (Seconds, Fewer Is Better)
V100: 115.48 (SE +/- 0.98, N = 3)
Radeon VII 2x: 117.98 (SE +/- 0.63, N = 3)
mi100: 132.60 (SE +/- 3.88, N = 12)
1. (CXX) g++ options:
- V100: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl
- Radeon VII 2x: -O2 -lOpenCL
- mi100: -O2 -lOpenCL

Rodinia

Test: OpenCL Heartwall

Rodinia 3.1 (Seconds, Fewer Is Better)
V100: 2.919
mi100: 3.133
SE +/- 0.012, N = 3 (the export gives a single SE entry and does not attribute it to a system)
1. (CXX) g++ options:
- V100: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl
- mi100: -O2 -lOpenCL

Darktable

Test: Boat - Acceleration: OpenCL

Darktable 2.4.2 (Seconds, Fewer Is Better)
mi100: 2.008 (SE +/- 0.014, N = 15)

Darktable

Test: Masskrug - Acceleration: OpenCL

Darktable 2.4.2 (Seconds, Fewer Is Better)
mi100: 5.075 (SE +/- 0.051, N = 3)

Darktable

Test: Server Rack - Acceleration: OpenCL

Darktable 2.4.2 (Seconds, Fewer Is Better)
mi100: 0.177 (SE +/- 0.005, N = 15)

Darktable

Test: Server Room - Acceleration: OpenCL

Darktable 2.4.2 (Seconds, Fewer Is Better)
mi100: 0.864 (SE +/- 0.001, N = 3)

Blender

Blend File: BMW27 - Compute: OpenCL

Blender 2.92 (Seconds, Fewer Is Better)
Radeon VII 2x: 37.48 (SE +/- 0.39, N = 3)
mi100: 53.76 (SE +/- 2.10, N = 15)
V100: 1281.46 (SE +/- 2.77, N = 3)

clpeak

OpenCL Test: Kernel Latency

clpeak (us, Fewer Is Better)
V100: 5.51 (SE +/- 0.06, N = 3)
Radeon VII 2x: 13.86 (SE +/- 0.24, N = 3)
mi100: 17.87 (SE +/- 0.64, N = 12)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Integer Compute INT

clpeak (GIOPS, More Is Better)
V100: 13899.17 (SE +/- 168.65, N = 3)
mi100: 7487.84 (SE +/- 5.18, N = 3)
Radeon VII 2x: 4583.14 (SE +/- 0.70, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Single-Precision Float

clpeak (GFLOPS, More Is Better)
mi100: 22813.55 (SE +/- 6.31, N = 3)
V100: 14073.61 (SE +/- 50.95, N = 3)
Radeon VII 2x: 13724.22 (SE +/- 0.64, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Double-Precision Double

clpeak (GFLOPS, More Is Better)
mi100: 11439.47 (SE +/- 3.65, N = 3)
V100: 7003.99 (SE +/- 57.57, N = 3)
Radeon VII 2x: 3441.77 (SE +/- 0.67, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Global Memory Bandwidth

clpeak (GBPS, More Is Better)
mi100: 960.15 (SE +/- 0.94, N = 3)
Radeon VII 2x: 808.05 (SE +/- 0.59, N = 3)
V100: 769.52 (SE +/- 0.50, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Transfer Bandwidth enqueueReadBuffer

clpeak (GBPS, More Is Better)
Radeon VII 2x: 20.29 (SE +/- 0.19, N = 3)
mi100: 4.86 (SE +/- 0.05, N = 6)
V100: 4.04 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Transfer Bandwidth enqueueWriteBuffer

clpeak (GBPS, More Is Better)
Radeon VII 2x: 26.38 (SE +/- 0.63, N = 15)
mi100: 10.96 (SE +/- 1.90, N = 15)
V100: 6.64 (SE +/- 0.19, N = 15)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL

Darktable

Test: Boat - Acceleration: OpenCL

Darktable 3.0.1 (Seconds, Fewer Is Better)
V100: 5.566 (SE +/- 0.038, N = 3)

Darktable

Test: Masskrug - Acceleration: OpenCL

Darktable 3.0.1 (Seconds, Fewer Is Better)
V100: 18.16 (SE +/- 0.18, N = 3)

Darktable

Test: Server Rack - Acceleration: OpenCL

Darktable 3.0.1 (Seconds, Fewer Is Better)
V100: 0.414 (SE +/- 0.008, N = 15)

Darktable

Test: Server Room - Acceleration: OpenCL

Darktable 3.0.1 (Seconds, Fewer Is Better)
V100: 1.810 (SE +/- 0.021, N = 15)


Phoronix Test Suite v10.8.4