first_m100_run
2 x Intel Xeon Gold 6132 testing with a Dell PowerEdge R740 [0M27WY] (2.14.2 BIOS) and Matrox G200eW3 32GB on AlmaLinux 8.7 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2211303-NE-FIRSTM10000&grw
System details: first m100 run on r740
  Processor:         2 x Intel Xeon Gold 6132 (28 Cores / 56 Threads)
  Motherboard:       Dell PowerEdge R740 [0M27WY] (2.14.2 BIOS)
  Chipset:           Intel Sky Lake-E DMI3 Registers
  Memory:            12 x 16 GB DDR4-2666MT/s M393A2K43BB1-CTD
  Disk:              240GB INTEL SSDSC2KB24
  Graphics:          Matrox G200eW3 32GB
  Monitor:           DELL U2412M
  Network:           4 x Intel X710 for 10GbE SFP+
  OS:                AlmaLinux 8.7
  Kernel:            4.18.0-425.3.1.el8.x86_64 (x86_64)
  Display Server:    X Server 1.20.11
  OpenCL:            OpenCL 2.1 AMD-APP (3486.0)
  Compiler:          GCC 8.5.0 20210514
  File-System:       xfs
  Screen Resolution: 1920x1200
System notes:
  - Transparent Huge Pages: always
  - Compiler configuration: --build=x86_64-redhat-linux --disable-libmpx --disable-libunwind-exceptions --enable-__cxa_atexit --enable-bootstrap --enable-cet --enable-checking=release --enable-gnu-indirect-function --enable-gnu-unique-object --enable-initfini-array --enable-languages=c,c++,fortran,lto --enable-multilib --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-arch_32=x86-64 --with-gcc-major-version-only --with-isl --with-linker-hash-style=gnu --with-tune=generic --without-cuda-driver
  - CPU Microcode: 0x2006e05
  - Python 2.7.18 + Python 3.6.8
  - Security mitigations:
      itlb_multihit: KVM: Mitigation of VMX disabled
      l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable
      mds: Mitigation of Clear buffers; SMT vulnerable
      meltdown: Mitigation of PTI
      mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable
      retbleed: Mitigation of IBRS
      spec_store_bypass: Mitigation of SSB disabled via prctl
      spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
      spectre_v2: Mitigation of IBRS IBPB: conditional RSB filling PBRSB-eIBRS: Not affected
      srbds: Not affected
      tsx_async_abort: Mitigation of Clear buffers; SMT vulnerable
Results overview: first m100 run on r740
  Test                                                   Result      Unit
  darktable: Boat - OpenCL                               1.406       Seconds
  darktable: Masskrug - OpenCL                           2.452       Seconds
  darktable: Server Rack - OpenCL                        0.265       Seconds
  darktable: Server Room - OpenCL                        0.808       Seconds
  shoc: OpenCL - Triad                                   11.8420     GB/s
  shoc: OpenCL - FFT SP                                  2802.33     GFLOPS
  shoc: OpenCL - MD5 Hash                                27.9275     GHash/s
  shoc: OpenCL - Max SP Flops                            18820013    GFLOPS
  shoc: OpenCL - Bus Speed Download                      13.7487     GB/s
  shoc: OpenCL - Bus Speed Readback                      13.1374     GB/s
  shoc: OpenCL - Texture Read Bandwidth                  696.907     GB/s
  rodinia: OpenCL Myocyte                                40.576      Seconds
  rodinia: OpenCL Heartwall                              3.060       Seconds
  cl-mem: Copy                                           281.9       GB/s
  cl-mem: Read                                           913.0       GB/s
  cl-mem: Write                                          728.0       GB/s
  clpeak: Kernel Latency                                 5.57        us
  clpeak: Integer Compute INT                            10328.26    GIOPS
  clpeak: Single-Precision Float                         22702.14    GFLOPS
  clpeak: Double-Precision Double                        11255.78    GFLOPS
  clpeak: Global Memory Bandwidth                        938.92      GBPS
  clpeak: Transfer Bandwidth enqueueReadBuffer           4.82        GBPS
  clpeak: Transfer Bandwidth enqueueWriteBuffer          5.63        GBPS
  smallpt-gpu: GPU - $VIDEO_WIDTH x $VIDEO_HEIGHT - Caustic3
Darktable 3.8.1 (Seconds, Fewer Is Better; first m100 run on r740)
  Test: Boat - Acceleration: OpenCL: 1.406 (SE +/- 0.011, N = 10)
  Test: Masskrug - Acceleration: OpenCL: 2.452 (SE +/- 0.023, N = 3)
  Test: Server Rack - Acceleration: OpenCL: 0.265 (SE +/- 0.003, N = 4)
  Test: Server Room - Acceleration: OpenCL: 0.808 (SE +/- 0.007, N = 3)
SHOC Scalable HeterOgeneous Computing 2020-04-17 (More Is Better; first m100 run on r740)
  Target: OpenCL - Benchmark: Triad: 11.84 GB/s (SE +/- 0.01, N = 3)
  Target: OpenCL - Benchmark: FFT SP: 2802.33 GFLOPS (SE +/- 1.97, N = 3)
  Target: OpenCL - Benchmark: MD5 Hash: 27.93 GHash/s (SE +/- 0.00, N = 3)
  Target: OpenCL - Benchmark: Max SP Flops: 18820013 GFLOPS (SE +/- 157796.98, N = 15)
  Target: OpenCL - Benchmark: Bus Speed Download: 13.75 GB/s (SE +/- 0.00, N = 3)
  Target: OpenCL - Benchmark: Bus Speed Readback: 13.14 GB/s (SE +/- 0.00, N = 3)
  Target: OpenCL - Benchmark: Texture Read Bandwidth: 696.91 GB/s (SE +/- 0.63, N = 3)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi
Rodinia 3.1 (Seconds, Fewer Is Better; first m100 run on r740)
  Test: OpenCL Myocyte: 40.58 (SE +/- 10.32, N = 15)
  Test: OpenCL Heartwall: 3.060 (SE +/- 0.010, N = 3)
  1. (CXX) g++ options: -O2 -lOpenCL
cl-mem 2017-01-13 (GB/s, More Is Better; first m100 run on r740)
  Benchmark: Copy: 281.9 (SE +/- 0.38, N = 3)
  Benchmark: Read: 913.0 (SE +/- 0.92, N = 3)
  Benchmark: Write: 728.0 (SE +/- 0.46, N = 3)
  1. (CC) gcc options: -O2 -flto -lOpenCL
clpeak (first m100 run on r740)
  OpenCL Test: Kernel Latency: 5.57 us, Fewer Is Better (SE +/- 0.03, N = 3)
  OpenCL Test: Integer Compute INT: 10328.26 GIOPS, More Is Better (SE +/- 9.71, N = 3)
  OpenCL Test: Single-Precision Float: 22702.14 GFLOPS, More Is Better (SE +/- 7.79, N = 3)
  OpenCL Test: Double-Precision Double: 11255.78 GFLOPS, More Is Better (SE +/- 9.15, N = 3)
  OpenCL Test: Global Memory Bandwidth: 938.92 GBPS, More Is Better (SE +/- 0.51, N = 3)
  OpenCL Test: Transfer Bandwidth enqueueReadBuffer: 4.82 GBPS, More Is Better (SE +/- 0.01, N = 3)
  OpenCL Test: Transfer Bandwidth enqueueWriteBuffer: 5.63 GBPS, More Is Better (SE +/- 0.01, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
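The SE figures reported with each result can be compared across tests by expressing them as a percentage of the mean. The following is a minimal sketch in Python, not part of the Phoronix Test Suite: the means and standard errors are copied by hand from a few of the results above, and the helper names `relative_se` and `noisy_results` as well as the 5% cutoff are assumptions chosen for illustration.

```python
# (test name, reported mean, reported standard error), copied from the results above.
RESULTS = [
    ("Darktable: Boat", 1.406, 0.011),
    ("SHOC: Max SP Flops", 18820013.0, 157796.98),
    ("Rodinia: OpenCL Myocyte", 40.58, 10.32),
    ("Rodinia: OpenCL Heartwall", 3.060, 0.010),
    ("clpeak: Kernel Latency", 5.57, 0.03),
]


def relative_se(mean, se):
    """Return the standard error as a percentage of the mean."""
    return 100.0 * se / mean


def noisy_results(results, threshold_pct=5.0):
    """Return the names of results whose relative SE exceeds threshold_pct.

    The 5% default is an arbitrary illustrative cutoff, not a PTS setting.
    """
    return [name for name, mean, se in results
            if relative_se(mean, se) > threshold_pct]


if __name__ == "__main__":
    for name, mean, se in RESULTS:
        print(f"{name}: {relative_se(mean, se):.2f}% relative SE")
    print("Above threshold:", noisy_results(RESULTS))
```

With the figures listed here, only Rodinia OpenCL Myocyte exceeds the cutoff, at roughly 25% relative SE; the other sampled results stay below 1%.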
Phoronix Test Suite v10.8.5