s-29cpu-242mem-4v100 Processor: Intel Xeon E5-2686 v4 @ 3.00GHz (16 Cores / 32 Threads), Motherboard: Xen HVM domU (4.11.amazon BIOS), Memory: 242GB, Disk: 1968GB, Graphics: Tesla V100-SXM2-16GB
OS: Ubuntu 18.04, Kernel: 4.19.128-flatcar (x86_64), Display Driver: NVIDIA, Compiler: GCC 7.5.0 + CUDA 10.1, File-System: ext4, System Layer: docker Xen HVM domU 4.11.amazon
Kernel Notes: Transparent Huge Pages: always
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v
Disk Notes: MQ-DEADLINE / relatime,rw,seclabel / Block Size: 4096
Processor Notes: Scaling Governor: intel_pstate performance - CPU Microcode: 0xb000038
Python Notes: Python 3.6.9 :: Anaconda
Security Notes: itlb_multihit: KVM: Vulnerable + l1tf: Mitigation of PTE Inversion + mds: Vulnerable: Clear buffers attempted no microcode; SMT Host state unknown + meltdown: Mitigation of PTI + spec_store_bypass: Vulnerable + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full generic retpoline STIBP: disabled RSB filling + srbds: Not affected + tsx_async_abort: Vulnerable: Clear buffers attempted no microcode; SMT Host state unknown
NCNN 20201218 - Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better): 26.83, SE +/- 0.37, N = 15, MIN: 19.69 / MAX: 441.1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenSSL
OpenSSL is an open-source toolkit that implements the SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test measures the RSA 4096-bit performance of OpenSSL.
OpenSSL 1.1.1 - RSA 4096-bit Performance (Signs Per Second, More Is Better): 2799.2, SE +/- 2.87, N = 3. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
POV-Ray
This is a test of POV-Ray, the Persistence of Vision Raytracer. POV-Ray is used to create 3D graphics using ray tracing.
POV-Ray 3.7.0.7 - Trace Time (Seconds, Fewer Is Better): 48.56, SE +/- 0.09, N = 3. (CXX) g++ options: -pipe -O3 -ffast-math -march=native -pthread -lXpm -lSM -lICE -lX11 -ltiff -ljpeg -lpng -lz -lrt -lm -lboost_thread -lboost_system
x265 3.4 - Video Input: Bosphorus 4K (Frames Per Second, More Is Better): 13.12, SE +/- 0.08, N = 3. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Kvazaar
This is a test of Kvazaar, a CPU-based H.265 video encoder written in the C programming language and optimized in Assembly. Kvazaar won the 2016 ACM Open-Source Software Competition and is developed at the Ultra Video Group, Tampere University, Finland.
Kvazaar 2.0 - Video Input: Bosphorus 1080p - Video Preset: Ultra Fast (Frames Per Second, More Is Better): 68.84, SE +/- 0.17, N = 3. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0 - Video Input: Bosphorus 1080p - Video Preset: Very Fast (Frames Per Second, More Is Better): 42.62, SE +/- 0.08, N = 3. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0 - Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, More Is Better): 18.97, SE +/- 0.01, N = 3. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0 - Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better): 12.14, SE +/- 0.01, N = 3. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0 - Video Input: Bosphorus 1080p - Video Preset: Medium (Frames Per Second, More Is Better): 18.82, SE +/- 0.04, N = 3. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0 - Video Input: Bosphorus 1080p - Video Preset: Slow (Frames Per Second, More Is Better): 18.11, SE +/- 0.03, N = 3. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0 - Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better): 4.84, SE +/- 0.01, N = 3. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0 - Video Input: Bosphorus 4K - Video Preset: Slow (Frames Per Second, More Is Better): 4.70, SE +/- 0.00, N = 3. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
CacheBench - Read Cache (MB/s, More Is Better): 2259.26, SE +/- 1.25, N = 3, MIN: 2229.7 / MAX: 2266.1. (CC) gcc options: -lrt
NAMD
NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign.
NAMD 2.13b1 - ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better): 1.66994, SE +/- 0.00701, N = 3
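The NAMD figure is reported in days/ns, where lower is better. Simulation throughput is often quoted the other way around, in ns/day, so a minimal sketch of the conversion may help when comparing against other sources. The function name below is hypothetical and not part of NAMD or the Phoronix Test Suite; the input value is the 1.66994 days/ns result reported for this system.

```python
def days_per_ns_to_ns_per_day(days_per_ns: float) -> float:
    """Invert a days/ns figure to get ns/day (higher is better)."""
    if days_per_ns <= 0:
        raise ValueError("days/ns must be positive")
    return 1.0 / days_per_ns


# Reported NAMD 2.13b1 ATPase result for this system, in days/ns.
namd_days_per_ns = 1.66994

ns_per_day = days_per_ns_to_ns_per_day(namd_days_per_ns)
print(f"{namd_days_per_ns} days/ns is about {ns_per_day:.3f} ns/day")
```

Under this conversion the reported result corresponds to roughly 0.6 ns of simulated time per day of wall-clock time.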
Rodinia
Rodinia is a suite focused upon accelerating compute-intensive applications with accelerators. CUDA, OpenMP, and OpenCL parallel models are supported by the included applications. This profile currently uses the OpenCL and OpenMP test binaries.
Rodinia 2.4 - Test: OpenMP CFD Solver (Seconds, Fewer Is Better): 23.54, SE +/- 0.31, N = 3. (CXX) g++ options: -O2 -lOpenCL
Rodinia 2.4 - Test: OpenMP LavaMD (Seconds, Fewer Is Better): 73.83, SE +/- 0.11, N = 3. (CXX) g++ options: -O2 -lOpenCL
MBW 2018-09-08 - Test: Memory Copy - Array Size: 1024 MiB (MiB/s, More Is Better): 7632.98, SE +/- 79.68, N = 3. (CC) gcc options: -O3 -march=native
RAMspeed SMP 3.5.0 - Type: Triad - Benchmark: Floating Point (MB/s, More Is Better): 18229.61, SE +/- 37.36, N = 3. (CC) gcc options: -O3 -march=native
RAMspeed SMP 3.5.0 - Type: Scale - Benchmark: Floating Point (MB/s, More Is Better): 16327.93, SE +/- 111.29, N = 3. (CC) gcc options: -O3 -march=native
RAMspeed SMP 3.5.0 - Type: Copy - Benchmark: Floating Point (MB/s, More Is Better): 16533.57, SE +/- 116.20, N = 3. (CC) gcc options: -O3 -march=native
RAMspeed SMP 3.5.0 - Type: Add - Benchmark: Floating Point (MB/s, More Is Better): 18359.43, SE +/- 70.74, N = 3. (CC) gcc options: -O3 -march=native
RAMspeed SMP 3.5.0 - Type: Average - Benchmark: Integer (MB/s, More Is Better): 17291.45, SE +/- 55.71, N = 3. (CC) gcc options: -O3 -march=native
PostMark
This is a test of NetApp's PostMark benchmark, designed to simulate the small-file workloads handled by web and mail servers. This test profile sets PostMark to perform 25,000 transactions with 500 files simultaneously, with file sizes ranging between 5 and 512 kilobytes.
PostMark 1.51 - Disk Transaction Performance (TPS, More Is Better): 3246. (CC) gcc options: -O3
Compile Bench
Compilebench tries to age a filesystem by simulating some of the disk IO common in creating, compiling, patching, stating and reading kernel trees. It indirectly measures how well filesystems can maintain directory locality as the disk fills up and directories age. This test is set up to use the makej mode with 10 initial directories.
Compile Bench 0.6 - Test: Read Compiled Tree (MB/s, More Is Better): 1419.76, SE +/- 22.63, N = 3
IOR
IOR is a parallel I/O storage benchmark making use of MPI with a particular focus on HPC (High Performance Computing) systems. IOR is developed at the Lawrence Livermore National Laboratory (LLNL).
IOR 3.3.0 - Block Size: 1024MB - Disk Target: Default Test Directory (MB/s, More Is Better): 246.85, SE +/- 0.15, N = 3, MIN: 233.25 / MAX: 254.08. (CC) gcc options: -O2 -lm -pthread -lmpi
IOR 3.3.0 - Block Size: 512MB - Disk Target: Default Test Directory (MB/s, More Is Better): 254.26, SE +/- 0.16, N = 3, MIN: 242.91 / MAX: 258.14. (CC) gcc options: -O2 -lm -pthread -lmpi
IOR 3.3.0 - Block Size: 256MB - Disk Target: Default Test Directory (MB/s, More Is Better): 259.88, SE +/- 0.04, N = 3, MIN: 250.16 / MAX: 265.38. (CC) gcc options: -O2 -lm -pthread -lmpi
IOR 3.3.0 - Block Size: 64MB - Disk Target: Default Test Directory (MB/s, More Is Better): 266.50, SE +/- 0.13, N = 3, MIN: 258.56 / MAX: 319.79. (CC) gcc options: -O2 -lm -pthread -lmpi
IOR 3.3.0 - Block Size: 32MB - Disk Target: Default Test Directory (MB/s, More Is Better): 267.97, SE +/- 0.24, N = 3, MIN: 250.71 / MAX: 436.44. (CC) gcc options: -O2 -lm -pthread -lmpi
IOR 3.3.0 - Block Size: 16MB - Disk Target: Default Test Directory (MB/s, More Is Better): 270.54, SE +/- 0.08, N = 3, MIN: 217.34 / MAX: 491.68. (CC) gcc options: -O2 -lm -pthread -lmpi
IOR 3.3.0 - Block Size: 8MB - Disk Target: Default Test Directory (MB/s, More Is Better): 275.14, SE +/- 0.89, N = 3, MIN: 223.76 / MAX: 690.39. (CC) gcc options: -O2 -lm -pthread -lmpi
IOR 3.3.0 - Block Size: 4MB - Disk Target: Default Test Directory (MB/s, More Is Better): 275.69, SE +/- 2.20, N = 15, MIN: 116.2 / MAX: 594.3. (CC) gcc options: -O2 -lm -pthread -lmpi
IOR 3.3.0 - Block Size: 2MB - Disk Target: Default Test Directory (MB/s, More Is Better): 179.89, SE +/- 2.23, N = 15, MIN: 77.15 / MAX: 412.77. (CC) gcc options: -O2 -lm -pthread -lmpi
Dbench 4.0 - 12 Clients (MB/s, More Is Better): 899.65, SE +/- 3.20, N = 3. (CC) gcc options: -lpopt -O2
FS-Mark 3.3 - Test: 4000 Files, 32 Sub Dirs, 1MB Size (Files/s, More Is Better): 167.4, SE +/- 1.33, N = 3. (CC) gcc options: -static
FS-Mark 3.3 - Test: 5000 Files, 1MB Size, 4 Threads (Files/s, More Is Better): 255.3, SE +/- 0.10, N = 3. (CC) gcc options: -static
FS-Mark 3.3 - Test: 1000 Files, 1MB Size (Files/s, More Is Better): 162.3, SE +/- 1.45, N = 15. (CC) gcc options: -static
Flexible IO Tester
FIO, the Flexible I/O Tester, is an advanced Linux disk benchmark supporting multiple I/O engines and a wealth of options. FIO was written by Jens Axboe for testing of the Linux I/O subsystem and schedulers.
Flexible IO Tester 3.25 results (IO Engine: Linux AIO - Buffered: No - Direct: Yes - Disk Target: Default Test Directory; IOPS and MB/s, More Is Better):
Type: Sequential Write - Block Size: 4KB: 23780 IOPS (SE +/- 176.53, N = 15); 92.9 MB/s (SE +/- 0.68, N = 15)
Type: Sequential Write - Block Size: 2MB: 127 IOPS; 261 MB/s
Type: Sequential Read - Block Size: 2MB: 127 IOPS; 261 MB/s
Type: Random Write - Block Size: 4KB: 6115 IOPS; 23.9 MB/s (SE +/- 0.00, N = 3)
Type: Random Write - Block Size: 2MB: 127 IOPS; 261 MB/s
Type: Random Read - Block Size: 4KB: 6118 IOPS; 23.9 MB/s (SE +/- 0.00, N = 3)
Type: Random Read - Block Size: 2MB: 127 IOPS; 261 MB/s
(CC) gcc options for all of the above: -rdynamic -ll -lnuma -lrt -lz -lpthread -lm -ldl -laio -std=gnu99 -ffast-math -include -O3 -fcommon -U_FORTIFY_SOURCE -march=native
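Each Flexible IO Tester configuration is reported twice, once in IOPS and once in MB/s. The two figures are related through the block size, and the short sketch below checks that relationship for the 4KB results. It assumes that "4KB" means 4096 bytes and that the reported MB/s values are effectively MiB/s (2^20 bytes); neither is stated on the page, and the helper name is hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB


def throughput_mib_per_s(iops: float, block_size_bytes: int) -> float:
    """Throughput implied by an IOPS figure at a fixed block size."""
    return iops * block_size_bytes / MIB


# (label, reported IOPS, assumed block size in bytes, reported MB/s),
# copied from the 4KB Flexible IO Tester results for this system.
reported = [
    ("Sequential Write, 4KB", 23780, 4 * KIB, 92.9),
    ("Random Write, 4KB", 6115, 4 * KIB, 23.9),
    ("Random Read, 4KB", 6118, 4 * KIB, 23.9),
]

for label, iops, block_size, reported_mb_s in reported:
    implied = throughput_mib_per_s(iops, block_size)
    print(f"{label}: {iops} IOPS implies {implied:.1f} MiB/s "
          f"(reported: {reported_mb_s} MB/s)")
```

Under these assumptions the 4KB rows agree to one decimal place. The 2MB rows do not reconcile as tightly: 127 IOPS at 2 MiB implies about 254 MiB/s against a reported 261 MB/s, possibly because the two figures are rounded or averaged separately.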
NCNN 20201218 - Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better): 35.14, SE +/- 0.84, N = 15, MIN: 24.88 / MAX: 489.38. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better): 41.77, SE +/- 1.25, N = 15, MIN: 28.15 / MAX: 317.33. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better): 18.09, SE +/- 0.55, N = 15, MIN: 11.76 / MAX: 228.64. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better): 25.29, SE +/- 3.47, N = 15, MIN: 15.53 / MAX: 703.39. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better): 70.33, SE +/- 4.61, N = 15, MIN: 45.29 / MAX: 543.47. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better): 5.73, SE +/- 0.17, N = 15, MIN: 3.84 / MAX: 156.31. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better): 15.99, SE +/- 0.52, N = 15, MIN: 10.04 / MAX: 420.18. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better): 11.06, SE +/- 0.39, N = 15, MIN: 7.21 / MAX: 419.03. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better): 11.71, SE +/- 0.34, N = 15, MIN: 7.81 / MAX: 456.48. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: mobilenet-v3 (ms, Fewer Is Better): 10.37, SE +/- 0.22, N = 15, MIN: 6.79 / MAX: 276.81. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: mobilenet-v2 (ms, Fewer Is Better): 11.96, SE +/- 0.22, N = 15, MIN: 8.02 / MAX: 224.73. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better): 33.29, SE +/- 0.55, N = 15, MIN: 23.8 / MAX: 422.32. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
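The NCNN results carry comparatively large "SE +/-" values and very high MAX latencies, so it can be useful to gauge how noisy each mean is. The sketch below assumes that "SE" is the standard error of the mean over the N runs, in which case the sample standard deviation is approximately SE * sqrt(N). The page does not spell this out, so treat it as an assumption; the helper names are hypothetical.

```python
import math


def approx_std_dev(standard_error: float, n: int) -> float:
    """Approximate sample standard deviation, assuming SE = sd / sqrt(N)."""
    return standard_error * math.sqrt(n)


def relative_se_percent(mean: float, standard_error: float) -> float:
    """Standard error expressed as a percentage of the mean."""
    return 100.0 * standard_error / mean


# model -> (mean latency in ms, SE, N), copied from the NCNN Vulkan GPU results.
ncnn_results = {
    "squeezenet_ssd": (35.14, 0.84, 15),
    "resnet50": (41.77, 1.25, 15),
    "resnet18": (25.29, 3.47, 15),
    "vgg16": (70.33, 4.61, 15),
}

for model, (mean_ms, se, n) in ncnn_results.items():
    sd = approx_std_dev(se, n)
    rel = relative_se_percent(mean_ms, se)
    print(f"{model}: mean {mean_ms} ms, SE {se} ({rel:.1f}% of mean), "
          f"approx. sd {sd:.2f} ms over {n} runs")
```

A relative standard error in the double digits, as for resnet18 here, suggests the mean is dominated by a few slow runs and should be compared against other systems with some caution.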
Flexible IO Tester 3.25 - Type: Sequential Read - IO Engine: Linux AIO - Buffered: No - Direct: Yes - Block Size: 4KB - Disk Target: Default Test Directory (IOPS and MB/s, More Is Better): 28900 IOPS (SE +/- 479.88, N = 15); 113 MB/s (SE +/- 1.76, N = 15). (CC) gcc options: -rdynamic -ll -lnuma -lrt -lz -lpthread -lm -ldl -laio -std=gnu99 -ffast-math -include -O3 -fcommon -U_FORTIFY_SOURCE -march=native
Testing initiated at 6 February 2021 16:32 by user sniklaus.