first_m100_run

2 x Intel Xeon Gold 6132 testing with a Dell PowerEdge R740 [0M27WY] (2.14.2 BIOS) and Matrox G200eW3 32GB on AlmaLinux 8.7 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2211303-NE-FIRSTM10000&grr .
System details: first m100 run on r740

  Processor:          2 x Intel Xeon Gold 6132 (28 Cores / 56 Threads)
  Motherboard:        Dell PowerEdge R740 [0M27WY] (2.14.2 BIOS)
  Chipset:            Intel Sky Lake-E DMI3 Registers
  Memory:             12 x 16 GB DDR4-2666MT/s M393A2K43BB1-CTD
  Disk:               240GB INTEL SSDSC2KB24
  Graphics:           Matrox G200eW3 32GB
  Monitor:            DELL U2412M
  Network:            4 x Intel X710 for 10GbE SFP+
  OS:                 AlmaLinux 8.7
  Kernel:             4.18.0-425.3.1.el8.x86_64 (x86_64)
  Display Server:     X Server 1.20.11
  OpenCL:             OpenCL 2.1 AMD-APP (3486.0)
  Compiler:           GCC 8.5.0 20210514
  File-System:        xfs
  Screen Resolution:  1920x1200

  - Transparent Huge Pages: always
  - --build=x86_64-redhat-linux --disable-libmpx --disable-libunwind-exceptions --enable-__cxa_atexit --enable-bootstrap --enable-cet --enable-checking=release --enable-gnu-indirect-function --enable-gnu-unique-object --enable-initfini-array --enable-languages=c,c++,fortran,lto --enable-multilib --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-arch_32=x86-64 --with-gcc-major-version-only --with-isl --with-linker-hash-style=gnu --with-tune=generic --without-cuda-driver
  - CPU Microcode: 0x2006e05
  - Python 2.7.18 + Python 3.6.8
  - itlb_multihit: KVM: Mitigation of VMX disabled
    + l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable
    + mds: Mitigation of Clear buffers; SMT vulnerable
    + meltdown: Mitigation of PTI
    + mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable
    + retbleed: Mitigation of IBRS
    + spec_store_bypass: Mitigation of SSB disabled via prctl
    + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
    + spectre_v2: Mitigation of IBRS IBPB: conditional RSB filling PBRSB-eIBRS: Not affected
    + srbds: Not affected
    + tsx_async_abort: Mitigation of Clear buffers; SMT vulnerable
Result overview: first m100 run on r740

  Test                                             Result
  rodinia: OpenCL Myocyte                          40.576
  shoc: OpenCL - Max SP Flops                      18820013
  clpeak: Transfer Bandwidth enqueueWriteBuffer    5.63
  clpeak: Transfer Bandwidth enqueueReadBuffer     4.82
  darktable: Boat - OpenCL                         1.406
  shoc: OpenCL - Texture Read Bandwidth            696.907
  clpeak: Double-Precision Double                  11255.78
  clpeak: Integer Compute INT                      10328.26
  clpeak: Global Memory Bandwidth                  938.92
  clpeak: Single-Precision Float                   22702.14
  clpeak: Kernel Latency                           5.57
  darktable: Masskrug - OpenCL                     2.452
  cl-mem: Write                                    728.0
  cl-mem: Read                                     913.0
  cl-mem: Copy                                     281.9
  rodinia: OpenCL Heartwall                        3.060
  darktable: Server Room - OpenCL                  0.808
  shoc: OpenCL - Triad                             11.8420
  shoc: OpenCL - FFT SP                            2802.33
  darktable: Server Rack - OpenCL                  0.265
  shoc: OpenCL - Bus Speed Readback                13.1374
  shoc: OpenCL - Bus Speed Download                13.7487
  shoc: OpenCL - MD5 Hash                          27.9275
  rodinia: OpenCL Leukocyte                        -
Rodinia 3.1
Test: OpenCL Myocyte
Seconds, Fewer Is Better
first m100 run on r740: 40.58 (SE +/- 10.32, N = 15)
1. (CXX) g++ options: -O2 -lOpenCL
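The standard error reported for Myocyte is large relative to its mean. As a rough sketch, assuming the "SE +/-" figure is the standard error of the mean over N runs (so that SD = SE * sqrt(N)), the relative error and the implied run-to-run standard deviation can be estimated as follows. The function name is illustrative and is not part of the Phoronix Test Suite.

```python
import math

def spread_from_se(mean, se, n):
    """Return (relative SE in percent, implied sample standard deviation).

    Assumes `se` is the standard error of the mean over `n` runs,
    i.e. se = sd / sqrt(n). This is an assumption about the report.
    """
    rel_se_pct = 100.0 * se / mean
    implied_sd = se * math.sqrt(n)
    return rel_se_pct, implied_sd

# Values taken from the Rodinia OpenCL Myocyte result above.
rel, sd = spread_from_se(mean=40.58, se=10.32, n=15)
print(f"relative SE: {rel:.1f}%  implied SD: {sd:.2f} s")
```

Under that assumption the relative standard error comes out at roughly 25%, far higher than for the other results in this report, so the Myocyte figure is best read with caution.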
SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Max SP Flops
GFLOPS, More Is Better
first m100 run on r740: 18820013 (SE +/- 157796.98, N = 15)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi
clpeak
OpenCL Test: Transfer Bandwidth enqueueWriteBuffer
GBPS, More Is Better
first m100 run on r740: 5.63 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
clpeak
OpenCL Test: Transfer Bandwidth enqueueReadBuffer
GBPS, More Is Better
first m100 run on r740: 4.82 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
Darktable 3.8.1
Test: Boat - Acceleration: OpenCL
Seconds, Fewer Is Better
first m100 run on r740: 1.406 (SE +/- 0.011, N = 10)
SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Texture Read Bandwidth
GB/s, More Is Better
first m100 run on r740: 696.91 (SE +/- 0.63, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi
clpeak
OpenCL Test: Double-Precision Double
GFLOPS, More Is Better
first m100 run on r740: 11255.78 (SE +/- 9.15, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
clpeak
OpenCL Test: Integer Compute INT
GIOPS, More Is Better
first m100 run on r740: 10328.26 (SE +/- 9.71, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
clpeak
OpenCL Test: Global Memory Bandwidth
GBPS, More Is Better
first m100 run on r740: 938.92 (SE +/- 0.51, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
clpeak
OpenCL Test: Single-Precision Float
GFLOPS, More Is Better
first m100 run on r740: 22702.14 (SE +/- 7.79, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
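As a quick cross-check of the clpeak compute figures, the ratio of single- to double-precision throughput can be computed from the two values reported above. This is only arithmetic on numbers from this report; the function and variable names are illustrative.

```python
# Values copied from the clpeak results in this report (GFLOPS).
single_precision_gflops = 22702.14
double_precision_gflops = 11255.78

def throughput_ratio(numerator, denominator):
    """Return numerator / denominator, guarding against a zero denominator."""
    if denominator == 0:
        raise ValueError("denominator must be non-zero")
    return numerator / denominator

ratio = throughput_ratio(single_precision_gflops, double_precision_gflops)
print(f"SP:DP throughput ratio: {ratio:.2f}")
```

For this run the ratio is roughly 2:1.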
clpeak
OpenCL Test: Kernel Latency
us, Fewer Is Better
first m100 run on r740: 5.57 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
Darktable 3.8.1
Test: Masskrug - Acceleration: OpenCL
Seconds, Fewer Is Better
first m100 run on r740: 2.452 (SE +/- 0.023, N = 3)
cl-mem 2017-01-13
Benchmark: Write
GB/s, More Is Better
first m100 run on r740: 728.0 (SE +/- 0.46, N = 3)
1. (CC) gcc options: -O2 -flto -lOpenCL
cl-mem 2017-01-13
Benchmark: Read
GB/s, More Is Better
first m100 run on r740: 913.0 (SE +/- 0.92, N = 3)
1. (CC) gcc options: -O2 -flto -lOpenCL
cl-mem 2017-01-13
Benchmark: Copy
GB/s, More Is Better
first m100 run on r740: 281.9 (SE +/- 0.38, N = 3)
1. (CC) gcc options: -O2 -flto -lOpenCL
Rodinia 3.1
Test: OpenCL Heartwall
Seconds, Fewer Is Better
first m100 run on r740: 3.060 (SE +/- 0.010, N = 3)
1. (CXX) g++ options: -O2 -lOpenCL
Darktable 3.8.1
Test: Server Room - Acceleration: OpenCL
Seconds, Fewer Is Better
first m100 run on r740: 0.808 (SE +/- 0.007, N = 3)
SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Triad
GB/s, More Is Better
first m100 run on r740: 11.84 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: FFT SP
GFLOPS, More Is Better
first m100 run on r740: 2802.33 (SE +/- 1.97, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi
Darktable 3.8.1
Test: Server Rack - Acceleration: OpenCL
Seconds, Fewer Is Better
first m100 run on r740: 0.265 (SE +/- 0.003, N = 4)
SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Bus Speed Readback
GB/s, More Is Better
first m100 run on r740: 13.14 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Bus Speed Download
GB/s, More Is Better
first m100 run on r740: 13.75 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: MD5 Hash
GHash/s, More Is Better
first m100 run on r740: 27.93 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi
Phoronix Test Suite v10.8.5