mi100-1

AMD Ryzen 5 3600 6-Core testing with a Gigabyte X570 AORUS PRO (F34 BIOS) and AMD Radeon VII 16GB on ManjaroLinux 21.1.0 via the Phoronix Test Suite.

mi100:

  Processor: 16 x Intel Core (Haswell no TSX) (16 Cores), Motherboard: RDO OpenStack Compute (1.11.0-2.el7 BIOS), Chipset: Intel 82G33/G31/P35/P31 + ICH9, Memory: 64GB, Disk: 21GB QEMU HDD + 107GB QEMU HDD, Graphics: Cirrus Logic GD 5446 32GB, Network: Red Hat Virtio device

  OS: Ubuntu 18.04, Kernel: 5.4.0-64-generic (x86_64), OpenCL: OpenCL 2.0 AMD-APP (3275.0), Compiler: GCC 7.5.0, File-System: ext4, Screen Resolution: 1024x768, System Layer: KVM

V100:

  Processor: 2 x Intel Xeon (Skylake IBRS) (2 Cores), Motherboard: RDO OpenStack Compute (1.11.0-2.el7 BIOS), Chipset: Intel 82G33/G31/P35/P31 + ICH9, Memory: 8GB, Disk: 21GB QEMU HDD + 53GB QEMU HDD, Graphics: Cirrus Logic GD 5446 8GB, Network: Red Hat Virtio device

  OS: Ubuntu 20.04, Kernel: 5.4.0-67-generic (x86_64), Display Driver: NVIDIA, OpenCL: OpenCL 1.2 CUDA 11.0.228, Vulkan: 1.2.133, Compiler: GCC 9.3.0 + CUDA 11.2, File-System: ext4, System Layer: KVM

Radeon VII 2x:

  Processor: AMD Ryzen 5 3600 6-Core @ 3.60GHz (6 Cores / 12 Threads), Motherboard: Gigabyte X570 AORUS PRO (F34 BIOS), Chipset: AMD Starship/Matisse, Memory: 32GB, Disk: 1000GB Sabrent Rocket 4.0 1TB + 240GB SanDisk SDSSDA24 + 256GB SanDisk SD8SN8U2 + 0GB Multiple Reader + 16GB SD/MMC/MS PRO + 510PF, Graphics: AMD Radeon VII 16GB (1801/1000MHz), Audio: AMD Vega 20 HDMI Audio, Network: Intel I211 + Intel Wi-Fi 6 AX200

  OS: ManjaroLinux 21.1.0, Kernel: 5.13.1-3-MANJARO (x86_64), Display Server: X Server 1.20.11, OpenGL: 4.6 Mesa 21.1.4 (LLVM 12.0.0), OpenCL: OpenCL 2.0 AMD-APP.dbg (3275.0), Compiler: GCC 11.1.0 + Clang 12.0.1, File-System: f2fs, Screen Resolution: 2560x1440

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Triad
GB/s > Higher Is Better
mi100 ......... 12.2740 |======================================================
V100 .......... 12.2649 |======================================================
Radeon VII 2x . 6.8301  |==============================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: FFT SP
GFLOPS > Higher Is Better
mi100 ......... 2783.51 |======================================================
V100 .......... 2278.09 |============================================
Radeon VII 2x . 2399.12 |===============================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: MD5 Hash
GHash/s > Higher Is Better
mi100 ......... 27.89 |==================================================
V100 .......... 31.09 |========================================================
Radeon VII 2x . 16.82 |==============================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Max SP Flops
GFLOPS > Higher Is Better
mi100 ......... 21943033.0 |===================================================
V100 .......... 14052.7    |
Radeon VII 2x . 7313300.0  |=================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Bus Speed Download
GB/s > Higher Is Better
mi100 ......... 13.6694 |======================================================
V100 .......... 12.3441 |=================================================
Radeon VII 2x . 7.1672  |============================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Bus Speed Readback
GB/s > Higher Is Better
mi100 ......... 14.0831 |======================================================
V100 .......... 13.1709 |===================================================
Radeon VII 2x . 7.1500  |===========================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Texture Read Bandwidth
GB/s > Higher Is Better
mi100 ......... 706.11  |==========================
V100 .......... 1470.52 |======================================================
Radeon VII 2x . 453.43  |=================

cl-mem 2017-01-13
Benchmark: Copy
GB/s > Higher Is Better
mi100 ......... 286.8 |=====================================================
V100 .......... 268.5 |==================================================
Radeon VII 2x . 300.8 |========================================================

cl-mem 2017-01-13
Benchmark: Read
GB/s > Higher Is Better
mi100 ......... 916.8 |========================================================
V100 .......... 780.2 |================================================
Radeon VII 2x . 808.5 |=================================================

cl-mem 2017-01-13
Benchmark: Write
GB/s > Higher Is Better
mi100 ......... 730.0 |=======================================================
V100 .......... 736.7 |========================================================
Radeon VII 2x . 674.0 |===================================================

Rodinia 3.1
Test: OpenCL Myocyte
Seconds < Lower Is Better
mi100 ......... 132.60 |=======================================================
V100 .......... 115.48 |================================================
Radeon VII 2x . 117.98 |=================================================

Rodinia 3.1
Test: OpenCL Heartwall
Seconds < Lower Is Better
mi100 . 3.133 |================================================================
V100 .. 2.919 |============================================================

Darktable 2.4.2
Test: Boat - Acceleration: OpenCL
Seconds < Lower Is Better
mi100 . 2.008 |================================================================

Darktable 2.4.2
Test: Masskrug - Acceleration: OpenCL
Seconds < Lower Is Better
mi100 . 5.075 |================================================================

Darktable 2.4.2
Test: Server Rack - Acceleration: OpenCL
Seconds < Lower Is Better
mi100 . 0.177 |================================================================

Darktable 2.4.2
Test: Server Room - Acceleration: OpenCL
Seconds < Lower Is Better
mi100 . 0.864 |================================================================

Blender 2.92
Blend File: BMW27 - Compute: OpenCL
Seconds < Lower Is Better
mi100 ......... 53.76   |==
V100 .......... 1281.46 |======================================================
Radeon VII 2x . 37.48   |==

clpeak
OpenCL Test: Kernel Latency
us < Lower Is Better
mi100 ......... 17.87 |========================================================
V100 .......... 5.51  |=================
Radeon VII 2x . 13.86 |===========================================

clpeak
OpenCL Test: Integer Compute INT
GIOPS > Higher Is Better
mi100 ......... 7487.84  |=============================
V100 .......... 13899.17 |=====================================================
Radeon VII 2x . 4583.14  |=================

clpeak
OpenCL Test: Single-Precision Float
GFLOPS > Higher Is Better
mi100 ......... 22813.55 |=====================================================
V100 .......... 14073.61 |=================================
Radeon VII 2x . 13724.22 |================================

clpeak
OpenCL Test: Double-Precision Double
GFLOPS > Higher Is Better
mi100 ......... 11439.47 |=====================================================
V100 .......... 7003.99  |================================
Radeon VII 2x . 3441.77  |================

clpeak
OpenCL Test: Global Memory Bandwidth
GBPS > Higher Is Better
mi100 ......... 960.15 |=======================================================
V100 .......... 769.52 |============================================
Radeon VII 2x . 808.05 |==============================================

clpeak
OpenCL Test: Transfer Bandwidth enqueueReadBuffer
GBPS > Higher Is Better
mi100 ......... 4.86  |=============
V100 .......... 4.04  |===========
Radeon VII 2x . 20.29 |========================================================

clpeak
OpenCL Test: Transfer Bandwidth enqueueWriteBuffer
GBPS > Higher Is Better
mi100 ......... 10.96 |=======================
V100 .......... 6.64  |==============
Radeon VII 2x . 26.38 |========================================================

Darktable 3.0.1
Test: Boat - Acceleration: OpenCL
Seconds < Lower Is Better
V100 . 5.566 |=================================================================

Darktable 3.0.1
Test: Masskrug - Acceleration: OpenCL
Seconds < Lower Is Better
V100 . 18.16 |=================================================================

Darktable 3.0.1
Test: Server Rack - Acceleration: OpenCL
Seconds < Lower Is Better
V100 . 0.414 |=================================================================

Darktable 3.0.1
Test: Server Room - Acceleration: OpenCL
Seconds < Lower Is Better
V100 . 1.810 |=================================================================