g-64cpu-236mem-8v100-3ssd Processor: Intel Xeon (32 Cores / 64 Threads), Motherboard: Google Compute Engine n1-standard-64, Memory: 236GB, Disk: 3 x 403GB nvme_card + 137GB PersistentDisk, Graphics: Tesla V100-SXM2-16GB
OS: Ubuntu 20.04, Kernel: 5.4.0-1036-gcp (x86_64), Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.9, Display Driver: NVIDIA, OpenCL: OpenCL 1.2 CUDA 11.2.109, Vulkan: 1.2.155, Compiler: GCC 9.3.0 + CUDA 10.1, File-System: ext4, System Layer: KVM
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Disk Notes: NONE / relatime,rw,stripe=384 / raid0 nvme0n3[2] nvme0n2[1] nvme0n1[0] Block Size: 4096
Processor Notes: CPU Microcode: 0x1
Python Notes: Python 3.8.5
Security Notes: itlb_multihit: Not affected + l1tf: Mitigation of PTE Inversion + mds: Mitigation of Clear buffers; SMT Host state unknown + meltdown: Mitigation of PTI + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
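The Disk Notes list `stripe=384`, a three-device `raid0` array, and a block size of 4096. The sketch below shows one way these numbers could fit together, assuming the stripe value is counted in filesystem blocks and that the array uses a 512 KiB chunk size. The chunk size is an assumption (it is not stated in this result), and the function and constant names are illustrative only.

```python
def ext4_stripe_blocks(chunk_kib: int, data_disks: int, fs_block_bytes: int) -> int:
    """Stripe width in filesystem blocks: (chunk size / block size) * data disks."""
    blocks_per_chunk = (chunk_kib * 1024) // fs_block_bytes
    return blocks_per_chunk * data_disks


# Assumption: 512 KiB per-device chunk; not reported in the result file.
ASSUMED_CHUNK_KIB = 512

stripe = ext4_stripe_blocks(ASSUMED_CHUNK_KIB, data_disks=3, fs_block_bytes=4096)
print(stripe)                # 384, matching stripe=384 in the Disk Notes
print(stripe * 4096 // 1024) # 1536 KiB written across the three devices per full stripe
```

Under that assumption, a full stripe covers 1536 KiB, which may help explain why large-block transfers in the disk tests scale better than small ones.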
CacheBench, Write Cache (MB/s, More Is Better): 21003.15 (SE +/- 31.83, N = 3; MIN: 18716.01 / MAX: 22747.26). (CC) gcc options: -lrt
clpeak, OpenCL Test: Single-Precision Float (GFLOPS, More Is Better): 15509.44 (SE +/- 117.34, N = 10). (CXX) g++ options: -O3 -rdynamic -lOpenCL
clpeak, OpenCL Test: Double-Precision Double (GFLOPS, More Is Better): 7748.08 (SE +/- 56.94, N = 11). (CXX) g++ options: -O3 -rdynamic -lOpenCL
clpeak, OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better): 768.35 (SE +/- 0.59, N = 3). (CXX) g++ options: -O3 -rdynamic -lOpenCL
FinanceBench FinanceBench is a collection of financial program benchmarks with support for benchmarking on the GPU via OpenCL and CPU benchmarking with OpenMP. The FinanceBench test cases are focused on Black-Scholes-Merton Process with Analytic European Option engine, QMC (Sobol) Monte-Carlo method (Equity Option Example), Bonds Fixed-rate bond with flat forward curve, and Repo Securities repurchase agreement. FinanceBench was originally written by the Cavazos Lab at University of Delaware. Learn more via the OpenBenchmarking.org test page.
FinanceBench 2016-07-25, Benchmark: Black-Scholes OpenCL (ms, Fewer Is Better): 1.331 (SE +/- 0.013, N = 5). (CXX) g++ options: -O3 -march=native -fopenmp
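For readers unfamiliar with the workload, here is a minimal sketch of analytic Black-Scholes pricing for a European option. It is not FinanceBench's OpenCL kernel: it is a plain CPU illustration that assumes no dividend yield, and all function and parameter names are hypothetical.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes(spot: float, strike: float, rate: float, vol: float,
                  years: float, is_call: bool = True) -> float:
    """Analytic European option price (assumes no dividends, constant rate and vol)."""
    sqrt_t = math.sqrt(years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * years) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    discounted_strike = strike * math.exp(-rate * years)
    if is_call:
        return spot * norm_cdf(d1) - discounted_strike * norm_cdf(d2)
    return discounted_strike * norm_cdf(-d2) - spot * norm_cdf(-d1)


# Illustrative inputs only; they are not taken from the benchmark's data set.
call = black_scholes(100.0, 100.0, 0.05, 0.2, 1.0, is_call=True)
put = black_scholes(100.0, 100.0, 0.05, 0.2, 1.0, is_call=False)
print(round(call, 4), round(put, 4))
```

Each option is priced independently of the others, which is why this kind of workload maps well onto a GPU.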
Flexible IO Tester FIO, the Flexible I/O Tester, is an advanced Linux disk benchmark supporting multiple I/O engines and a wealth of options. FIO was written by Jens Axboe for testing of the Linux I/O subsystem and schedulers. Learn more via the OpenBenchmarking.org test page.
Flexible IO Tester 3.25 results (IO Engine: Linux AIO, Buffered: No, Direct: Yes, Disk Target: Default Test Directory; More Is Better):
Random Read, Block Size 2MB: 2108 MB/s; 1050 IOPS (SE +/- 0.33, N = 3)
Random Read, Block Size 4KB: 872 MB/s (SE +/- 10.69, N = 3); 223333 IOPS (SE +/- 2848.00, N = 3)
Random Write, Block Size 2MB: 1172 MB/s (SE +/- 0.58, N = 3); 583 IOPS
Random Write, Block Size 4KB: 871 MB/s (SE +/- 4.16, N = 3); 223000 IOPS (SE +/- 1154.70, N = 3)
Sequential Read, Block Size 2MB: 2109 MB/s (SE +/- 0.33, N = 3); 1051 IOPS (SE +/- 0.33, N = 3)
Sequential Read, Block Size 4KB: 657 MB/s (SE +/- 1.73, N = 3); 168333 IOPS (SE +/- 333.33, N = 3)
Sequential Write, Block Size 2MB: 1172 MB/s; 583 IOPS
Sequential Write, Block Size 4KB: 906 MB/s (SE +/- 11.05, N = 4); 232000 IOPS (SE +/- 2798.81, N = 4)
(CC) gcc options: -rdynamic -ll -lnuma -lrt -lz -lpthread -lm -ldl -laio -std=gnu99 -ffast-math -include -O3 -fcommon -U_FORTIFY_SOURCE -march=native
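The throughput and IOPS figures for each FIO configuration are two views of the same measurement, linked by the block size. The sketch below cross-checks them, assuming the reported "MB/s" values are binary megabytes (MiB/s). That is an inference from the numbers, not something the result states, and the helper name is illustrative.

```python
def iops_to_mib_per_s(iops: float, block_bytes: int) -> float:
    """Convert an IOPS figure to throughput in MiB/s for a fixed block size."""
    return iops * block_bytes / (1024 * 1024)


KIB = 1024
MIB = 1024 * 1024

# 4KB random read: 223333 IOPS reported alongside 872 MB/s.
print(round(iops_to_mib_per_s(223333, 4 * KIB)))  # 872

# 4KB sequential write: 232000 IOPS reported alongside 906 MB/s.
print(round(iops_to_mib_per_s(232000, 4 * KIB)))  # 906

# 2MB random read: 1050 IOPS gives 2100 MiB/s, close to the reported 2108 MB/s.
print(iops_to_mib_per_s(1050, 2 * MIB))  # 2100.0
```

The small gap in the 2MB case may simply reflect rounding of the averaged IOPS value.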
FS-Mark 3.3 results (Files/s, More Is Better; (CC) gcc options: -static):
Test: 5000 Files, 1MB Size, 4 Threads: 182.7 (SE +/- 0.87, N = 3)
Test: 4000 Files, 32 Sub Dirs, 1MB Size: 104.1 (SE +/- 0.52, N = 3)
Test: 1000 Files, 1MB Size, No Sync/FSync: 1256.5 (SE +/- 3.41, N = 3)
IOR IOR is a parallel I/O storage benchmark making use of MPI with a particular focus on HPC (High Performance Computing) systems. IOR is developed at the Lawrence Livermore National Laboratory (LLNL). Learn more via the OpenBenchmarking.org test page.
IOR 3.3.0 results (MB/s, More Is Better; Disk Target: Default Test Directory; (CC) gcc options: -O2 -lm -pthread -lmpi):
Block Size 2MB: 270.11 (SE +/- 1.44, N = 3; MIN: 44.75 / MAX: 579.19)
Block Size 4MB: 330.23 (SE +/- 3.31, N = 3; MIN: 50.91 / MAX: 685.98)
Block Size 8MB: 386.94 (SE +/- 2.57, N = 15; MIN: 132 / MAX: 737.75)
Block Size 16MB: 419.04 (SE +/- 4.97, N = 4; MIN: 133.79 / MAX: 875.6)
Block Size 32MB: 436.63 (SE +/- 6.46, N = 9; MIN: 66.42 / MAX: 841.25)
Block Size 64MB: 389.06 (SE +/- 2.57, N = 3; MIN: 102.12 / MAX: 696.26)
Block Size 256MB: 511.41 (SE +/- 3.70, N = 3; MIN: 281.12 / MAX: 642.02)
Block Size 512MB: 556.02 (SE +/- 2.68, N = 3; MIN: 399.88 / MAX: 653.46)
Block Size 1024MB: 686.48 (SE +/- 2.42, N = 3; MIN: 599.17 / MAX: 738.51)
Kvazaar This is a test of Kvazaar as a CPU-based H.265 video encoder written in the C programming language and optimized in Assembly. Kvazaar is the winner of the 2016 ACM Open-Source Software Competition and developed at the Ultra Video Group, Tampere University, Finland. Learn more via the OpenBenchmarking.org test page.
Kvazaar 2.0 results (Frames Per Second, More Is Better; (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt):
Video Input: Bosphorus 4K, Video Preset: Slow: 8.90 (SE +/- 0.03, N = 3)
Video Input: Bosphorus 4K, Video Preset: Medium: 9.10 (SE +/- 0.00, N = 3)
Video Input: Bosphorus 1080p, Video Preset: Slow: 24.99 (SE +/- 0.03, N = 3)
Video Input: Bosphorus 1080p, Video Preset: Medium: 25.70 (SE +/- 0.04, N = 3)
Video Input: Bosphorus 4K, Video Preset: Very Fast: 21.04 (SE +/- 0.03, N = 3)
Video Input: Bosphorus 4K, Video Preset: Ultra Fast: 33.36 (SE +/- 0.04, N = 3)
Video Input: Bosphorus 1080p, Video Preset: Very Fast: 60.78 (SE +/- 0.14, N = 3)
Video Input: Bosphorus 1080p, Video Preset: Ultra Fast: 111.67 (SE +/- 0.33, N = 3)
MBW 2018-09-08, Test: Memory Copy, Fixed Block Size - Array Size: 1024 MiB (MiB/s, More Is Better): 4673.44 (SE +/- 5.47, N = 3). (CC) gcc options: -O3 -march=native
NAMD NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. Learn more via the OpenBenchmarking.org test page.
NAMD 2.13b1, ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better): 0.75058 (SE +/- 0.00212, N = 3)
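NAMD reports its result in days per simulated nanosecond, so a lower figure is better. Simulation throughput is often quoted the other way round, in nanoseconds per day. A minimal conversion sketch, with an illustrative function name:

```python
def days_per_ns_to_ns_per_day(days_per_ns: float) -> float:
    """Invert a days/ns figure to get simulated nanoseconds per wall-clock day."""
    if days_per_ns <= 0:
        raise ValueError("days_per_ns must be positive")
    return 1.0 / days_per_ns


# Reported result for the ATPase simulation: 0.75058 days/ns.
ns_per_day = days_per_ns_to_ns_per_day(0.75058)
print(f"{ns_per_day:.2f} ns/day")  # 1.33 ns/day
```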
NCNN 20201218 results (ms, Fewer Is Better; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread):
Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2: 10.72 (SE +/- 0.14, N = 4; MIN: 10 / MAX: 12.52)
Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3: 9.74 (SE +/- 0.08, N = 4; MIN: 9.23 / MAX: 11.34)
Target: Vulkan GPU - Model: shufflenet-v2: 10.25 (SE +/- 0.09, N = 4; MIN: 9.57 / MAX: 11.45)
Target: Vulkan GPU - Model: mnasnet: 9.94 (SE +/- 0.07, N = 4; MIN: 9.26 / MAX: 26.34)
Target: Vulkan GPU - Model: efficientnet-b0: 12.89 (SE +/- 0.08, N = 4; MIN: 12.24 / MAX: 16.29)
Target: Vulkan GPU - Model: blazeface: 4.97 (SE +/- 0.07, N = 4; MIN: 4.57 / MAX: 8.01)
Target: Vulkan GPU - Model: googlenet: 22.58 (SE +/- 0.15, N = 4; MIN: 21.69 / MAX: 24.61)
Target: Vulkan GPU - Model: vgg16: 44.30 (SE +/- 0.41, N = 4; MIN: 41.96 / MAX: 59.88)
Target: Vulkan GPU - Model: resnet18: 14.80 (SE +/- 0.17, N = 4; MIN: 13.88 / MAX: 30.07)
Target: Vulkan GPU - Model: alexnet: 9.28 (SE +/- 0.08, N = 4; MIN: 8.72 / MAX: 10.65)
Target: Vulkan GPU - Model: resnet50: 28.85 (SE +/- 0.17, N = 4; MIN: 27.64 / MAX: 33.48)
Target: Vulkan GPU - Model: yolov4-tiny: 35.98 (SE +/- 0.63, N = 4; MIN: 33.87 / MAX: 52.54)
Target: Vulkan GPU - Model: squeezenet_ssd: 27.12 (SE +/- 0.32, N = 4; MIN: 25.88 / MAX: 29.7)
Target: Vulkan GPU - Model: regnety_400m: 68.78 (SE +/- 0.29, N = 4; MIN: 65.39 / MAX: 87.03)
OpenSSL OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test measures the RSA 4096-bit performance of OpenSSL. Learn more via the OpenBenchmarking.org test page.
OpenSSL 1.1.1, RSA 4096-bit Performance (Signs Per Second, More Is Better): 4250.3 (SE +/- 4.46, N = 3). (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
PlaidML results (FPS, More Is Better; Mode: Inference, Device: OpenCL):
FP16: No, Network: Mobilenet: 2904.48 (SE +/- 3.39, N = 3)
FP16: Yes, Network: Mobilenet: 3390.59 (SE +/- 1.58, N = 3)
FP16: No, Network: DenseNet 201: 245.66 (SE +/- 0.12, N = 3)
PostMark This is a test of NetApp's PostMark benchmark designed to simulate small-file testing similar to the tasks endured by web and mail servers. This test profile will set PostMark to perform 25,000 transactions with 500 files simultaneously with the file sizes ranging between 5 and 512 kilobytes. Learn more via the OpenBenchmarking.org test page.
PostMark 1.51, Disk Transaction Performance (TPS, More Is Better): 3623. (CC) gcc options: -O3
POV-Ray This is a test of POV-Ray, the Persistence of Vision Raytracer. POV-Ray is used to create 3D graphics using ray-tracing. Learn more via the OpenBenchmarking.org test page.
POV-Ray 3.7.0.7, Trace Time (Seconds, Fewer Is Better): 23.46 (SE +/- 0.24, N = 3). (CXX) g++ options: -pipe -O3 -ffast-math -march=native -pthread -lSM -lICE -lX11 -lIlmImf -lImath -lHalf -lIex -lIexMath -lIlmThread -lpthread -ltiff -ljpeg -lpng -lz -lrt -lm -lboost_thread -lboost_system
RAMspeed SMP 3.5.0 results (MB/s, More Is Better; (CC) gcc options: -O3 -march=native):
Type: Copy, Benchmark: Integer: 12444.10 (SE +/- 2.54, N = 3)
Type: Scale, Benchmark: Integer: 17371.93 (SE +/- 16.62, N = 3)
Type: Triad, Benchmark: Integer: 17345.33 (SE +/- 14.90, N = 3)
Type: Average, Benchmark: Integer: 15941.58 (SE +/- 5.77, N = 3)
Type: Add, Benchmark: Floating Point: 14486.90 (SE +/- 22.25, N = 3)
Type: Copy, Benchmark: Floating Point: 12394.96 (SE +/- 17.50, N = 3)
Type: Scale, Benchmark: Floating Point: 13447.33 (SE +/- 18.70, N = 3)
Type: Triad, Benchmark: Floating Point: 18499.20 (SE +/- 25.02, N = 3)
Type: Average, Benchmark: Floating Point: 14699.54 (SE +/- 90.68, N = 3)
Rodinia Rodinia is a suite focused upon accelerating compute-intensive applications with accelerators. CUDA, OpenMP, and OpenCL parallel models are supported by the included applications. This profile utilizes the OpenCL and OpenMP test binaries at the moment. Learn more via the OpenBenchmarking.org test page.
Rodinia results (Seconds, Fewer Is Better; (CXX) g++ options: -O2 -lOpenCL):
Rodinia 2.4, Test: OpenMP LavaMD: 12.75 (SE +/- 0.03, N = 3)
Rodinia 2.4, Test: OpenMP CFD Solver: 11.31 (SE +/- 0.14, N = 3)
Rodinia 3.1, Test: OpenCL Particle Filter: 4.256 (SE +/- 0.034, N = 13)
SQLite 3.30.1 results (Seconds, Fewer Is Better; (CC) gcc options: -O2 -lz -lm -ldl -lpthread):
Threads / Copies: 8: 328.33 (SE +/- 3.91, N = 3)
Threads / Copies: 32: 486.79 (SE +/- 0.86, N = 3)
Threads / Copies: 64: 603.43 (SE +/- 2.40, N = 3)
Threads / Copies: 128: 743.49 (SE +/- 3.25, N = 3)
Stockfish This is a test of Stockfish, an advanced C++11 chess benchmark that can scale up to 128 CPU cores. Learn more via the OpenBenchmarking.org test page.
Stockfish 9, Total Time (Nodes Per Second, More Is Better): 60962172 (SE +/- 97295.82, N = 3). (CXX) g++ options: -m64 -lpthread -fno-exceptions -std=c++11 -pedantic -O3 -msse -msse3 -mpopcnt -flto
ViennaCL ViennaCL is an open-source linear algebra library written in C++ and with support for OpenCL and OpenMP. This test profile uses ViennaCL OpenCL support and runs the included computational benchmark. Learn more via the OpenBenchmarking.org test page.
ViennaCL 1.4.2, OpenCL LU Factorization (GFLOPS, More Is Better): 47.39 (SE +/- 0.09, N = 3). (CXX) g++ options: -rdynamic -lOpenCL
x264 This is a simple test of the x264 encoder run on the CPU (OpenCL support disabled) with a sample video file. Learn more via the OpenBenchmarking.org test page.
x264 2018-09-25, H.264 Video Encoding (Frames Per Second, More Is Better): 132.47 (SE +/- 1.09, N = 3). (CC) gcc options: -ldl -lavformat -lavcodec -lavutil -lswscale -m64 -lm -lpthread -O3 -ffast-math -std=gnu99 -fPIC -fomit-frame-pointer -fno-tree-vectorize
x265 3.4, Video Input: Bosphorus 1080p (Frames Per Second, More Is Better): 48.50 (SE +/- 0.30, N = 3). (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Testing initiated at 12 February 2021 23:22 by user sniklaus.