compulab-airtop-3-rtx-4000-compute
Intel Xeon E-2288G testing with a Compulab SBC-ATCFL v1.2 (ATOP3.PRD.0.29.2 BIOS) and NVIDIA Quadro RTX 4000 8GB on Ubuntu 20.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2010311-FI-COMPULABA24&sro&grs .
System configuration
Runs: 1, 1a, 2, 1b, 1c, 1d, 1e, NVIDIA Quadro RTX 4000, RTX 4000, NVIDIA RTX 4000
  Processor: Intel Xeon E-2288G @ 5.00GHz (8 Cores / 16 Threads)
  Motherboard: Compulab SBC-ATCFL v1.2 (ATOP3.PRD.0.29.2 BIOS)
  Chipset: Intel Cannon Lake PCH
  Memory: 64GB
  Disk: Samsung SSD 970 EVO Plus 250GB
  Graphics: NVIDIA Quadro RTX 4000 8GB (1005/6500MHz)
  Audio: Intel Cannon Lake PCH cAVS
  Monitor: VE228
  Network: Intel I219-LM + Intel I210
  OS: Ubuntu 20.10
  Kernel: 5.8.0-26-generic (x86_64)
  Desktop: GNOME Shell 3.38.1
  Display Server: X Server 1.20.9
  Display Driver: NVIDIA 455.28
  OpenGL: 4.6.0
  OpenCL: OpenCL 1.2 CUDA 11.1.96
  Vulkan: 1.2.142
  Compiler: GCC 10.2.0
  File-System: ext4
  Screen Resolution: 1920x1080
Further Graphics entries listed for subsequent runs, in export order: NVIDIA Quadro RTX 4000 8GB (300/405MHz); NVIDIA Quadro RTX 4000 8GB (1005/6500MHz); NVIDIA Quadro RTX 4000 8GB (300/405MHz); NVIDIA Quadro RTX 4000 8GB (1005/6500MHz)
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xd6 - Thermald 2.3
OpenCL Details: GPU Compute Cores: 2304
Python Details: 1, 1a, 1b, 1d, 1e, NVIDIA Quadro RTX 4000, RTX 4000, NVIDIA RTX 4000: Python 3.8.6
Security Details:
  itlb_multihit: KVM: Mitigation of VMX disabled
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling
  srbds: Mitigation of TSX disabled
  tsx_async_abort: Mitigation of TSX disabled
Results overview
A "-" marks a test with no result for that run. Run 2 is listed in the export but has no result values.
Test | 1 | 1a | 1b | 1c | 1d | 1e | NVIDIA Quadro RTX 4000 | RTX 4000 | NVIDIA RTX 4000
realsr-ncnn: 4x - Yes | 80.445 | 81.129 | 81.544 | 82.464 | 84.174 | 81.174 | 85.926 | 87.793 | 80.549
clpeak: Single-Precision Float | - | - | - | - | - | 6033.10 | 6004.48 | 6536.45 | -
hashcat: 7-Zip | 446900 | 444300 | 442667 | - | 433300 | 443000 | 424633 | 417767 | 446400
hashcat: MD5 | 25121966667 | 24944700000 | 24838966667 | - | 24259033333 | 24876900000 | 23840966667 | 23506866667 | 25075466667
realsr-ncnn: 4x - No | 12.552 | 12.604 | 12.665 | 12.701 | 12.978 | 12.580 | 13.078 | 13.350 | 12.494
waifu2x-ncnn: 2x - 3 - Yes | 5.502 | 5.551 | 5.572 | 5.671 | 5.694 | 5.571 | 5.785 | 5.861 | 5.540
hashcat: SHA1 | 8704500000 | 8642000000 | 8615933333 | - | 8426866667 | 8633500000 | 8254033333 | 8181566667 | 8700533333
hashcat: TrueCrypt RIPEMD160 + XTS | 327233 | 326300 | 324033 | - | 317867 | 324267 | 312233 | 309400 | 327567
hashcat: SHA-512 | 1102266667 | 1095266667 | 1091033333 | - | 1070066667 | 1092200000 | 1049733333 | 1041500000 | 1100433333
clpeak: Integer Compute INT | - | - | - | - | - | 5712.25 | 5741.92 | 6013.59 | -
ncnn: Vulkan GPU - resnet50 | - | - | - | - | - | 3.89 | 3.93 | 4.08 | -
vkfft | 25694 | 25585 | 25486 | - | 25027 | 25457 | 24684 | 24538 | 25593
ncnn: Vulkan GPU - alexnet | - | - | - | - | - | 2.18 | 2.21 | 2.27 | -
ncnn: Vulkan GPU - vgg16 | - | - | - | - | - | 8.79 | 9.14 | 9.02 | -
redshift | - | - | - | - | 382 | 381 | 391 | 393 | 379
blender: Classroom - CUDA | - | - | - | - | - | 218.66 | 224.85 | 223.81 | -
ncnn: Vulkan GPU - googlenet | - | - | - | - | - | 3.32 | 3.40 | 3.37 | -
ncnn: Vulkan GPU - shufflenet-v2 | - | - | - | - | - | 1.33 | 1.35 | 1.36 | -
blender: Classroom - NVIDIA OptiX | - | - | - | - | - | 115.78 | 118.39 | 117.20 | -
cl-mem: Write | 325.5 | 320.4 | 321.1 | - | 321.1 | 322.5 | 318.4 | 319.4 | 323.0
blender: Barbershop - CUDA | - | - | - | - | - | 756.64 | 771.19 | 764.30 | -
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 | - | - | - | - | - | 1.71 | 1.74 | 1.74 | -
ncnn: Vulkan GPU - blazeface | - | - | - | - | - | 0.63 | 0.63 | 0.64 | -
cl-mem: Copy | 283.0 | 282.6 | 282.1 | - | 281.1 | 282.1 | 278.7 | 278.8 | 282.1
clpeak: Global Memory Bandwidth | - | - | - | - | - | 346.09 | 340.91 | 342.37 | -
ncnn: Vulkan GPU - mnasnet | - | - | - | - | - | 1.52 | 1.54 | 1.54 | -
blender: Fishy Cat - NVIDIA OptiX | - | - | - | - | - | 58.38 | 59.12 | 58.69 | -
plaidml: No - Inference - DenseNet 201 - OpenCL | - | - | - | - | - | 140.54 | 138.91 | 139.15 | -
blender: Pabellon Barcelona - NVIDIA OptiX | - | - | - | - | - | 160.82 | 160.95 | 159.09 | -
blender: Barbershop - NVIDIA OptiX | - | - | - | - | - | 1307.24 | 1321.89 | 1307.37 | -
blender: Fishy Cat - CUDA | - | - | - | - | - | 112.74 | 114.00 | 113.88 | -
ncnn: Vulkan GPU - efficientnet-b0 | - | - | - | - | - | 2.73 | 2.75 | 2.76 | -
blender: BMW27 - CUDA | - | - | - | - | - | 57.80 | 58.10 | 58.41 | -
plaidml: No - Inference - Mobilenet - OpenCL | - | - | - | - | - | 1490.38 | 1475.98 | 1480.20 | -
plaidml: Yes - Inference - Mobilenet - OpenCL | - | - | - | - | - | 1843.84 | 1834.20 | 1829.24 | -
ncnn: Vulkan GPU - squeezenet | - | - | - | - | - | 3.77 | 3.78 | 3.80 | -
viennacl: OpenCL LU Factorization | 68.2883 | 68.4737 | 68.2546 | - | 68.3795 | 68.4059 | 68.0188 | 68.0204 | 68.2894
plaidml: No - Inference - IMDB LSTM - OpenCL | - | - | - | - | - | 423.50 | 421.03 | 420.80 | -
fahbench | - | - | - | - | 191.6264 | 191.8417 | 190.7594 | 190.8199 | -
mandelgpu: GPU | - | - | - | - | - | 248122412.9 | 248177151.4 | 246857018.0 | -
arrayfire: Conjugate Gradient OpenCL | - | - | - | - | - | 2.247 | 2.255 | 2.257 | -
blender: Pabellon Barcelona - CUDA | - | - | - | - | - | 459.35 | 460.64 | 459.08 | -
cl-mem: Read | 379.7 | 379.2 | 379.2 | - | 379.3 | 379.3 | 379.3 | 379.3 | 379.3
clpeak: Double-Precision Double | - | - | - | - | - | 259.66 | 259.50 | 259.33 | -
financebench: Black-Scholes OpenCL | 14.034 | 14.032 | 14.037 | - | 14.036 | 14.034 | 14.034 | 14.034 | 14.032
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 | - | - | - | - | - | 1.48 | 1.48 | 1.48 | -
neatbench: GPU | - | - | - | - | - | 31.0 | 30.9 | 30.2 | -
blender: BMW27 - NVIDIA OptiX | - | - | - | - | - | 32.14 | 29.23 | 29.14 | -
ncnn: Vulkan GPU - yolov4-tiny | - | - | - | - | - | 8.62 | 8.33 | 8.22 | -
ncnn: Vulkan GPU - resnet18 | - | - | - | - | - | 1.8 | 1.95 | 1.77 | -
ncnn: Vulkan GPU - mobilenet | - | - | - | - | - | 4.66 | 4.67 | 4.86 | -
luxcorerender-cl: Rainbow Colors and Prism | - | - | - | - | 10.42 | 10.78 | 10.79 | 10.73 | -
luxcorerender-cl: LuxCore Benchmark | - | - | - | - | 3.40 | 3.50 | 3.47 | 3.47 | -
luxcorerender-cl: Food | - | - | - | - | 1.52 | 1.57 | 1.55 | 1.55 | -
luxcorerender-cl: DLSC | - | - | - | - | 3.99 | 4.09 | 4.01 | 4.02 | -
RealSR-NCNN 20200818 - Scale: 4x - TAA: Yes (Seconds, Fewer Is Better)
  1: 80.45 (SE +/- 0.35, N = 3)
  1a: 81.13 (SE +/- 0.36, N = 3)
  1b: 81.54 (SE +/- 0.40, N = 3)
  1c: 82.46 (SE +/- 0.40, N = 3)
  1d: 84.17 (SE +/- 0.36, N = 3)
  1e: 81.17 (SE +/- 0.36, N = 3)
  NVIDIA Quadro RTX 4000: 85.93 (SE +/- 0.45, N = 3)
  NVIDIA RTX 4000: 80.55 (SE +/- 0.37, N = 3)
  RTX 4000: 87.79 (SE +/- 0.27, N = 3)
clpeak - OpenCL Test: Single-Precision Float (GFLOPS, More Is Better)
  1e: 6033.10 (SE +/- 35.64, N = 3)
  NVIDIA Quadro RTX 4000: 6004.48 (SE +/- 55.06, N = 3)
  RTX 4000: 6536.45 (SE +/- 97.61, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL
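Whether two of the averages above differ by more than run-to-run noise can be roughly judged by comparing their gap to the reported standard errors. The following is a minimal sketch, not part of the Phoronix Test Suite: the helper name `separation_in_se` is hypothetical, and combining the two standard errors in quadrature assumes the runs are independent.

```python
import math


def separation_in_se(mean_a, se_a, mean_b, se_b):
    """Gap between two means, expressed in units of their combined standard error.

    Assumption: the two runs are independent, so the standard errors
    combine as sqrt(se_a^2 + se_b^2).
    """
    combined = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined


# clpeak Single-Precision Float (GFLOPS): mean and SE as reported above.
rtx_vs_1e = separation_in_se(6536.45, 97.61, 6033.10, 35.64)
quadro_vs_1e = separation_in_se(6004.48, 55.06, 6033.10, 35.64)

print(f"RTX 4000 vs 1e: {rtx_vs_1e:.1f} combined SE")
print(f"NVIDIA Quadro RTX 4000 vs 1e: {quadro_vs_1e:.1f} combined SE")
```

A gap of several combined standard errors suggests a real difference between runs, while a gap below one is within noise. With N = 3 per run this is only a rough guide.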
Hashcat 6.1.1 - Benchmark: 7-Zip (H/s, More Is Better)
  1: 446900 (SE +/- 208.17, N = 3)
  1a: 444300 (SE +/- 556.78, N = 3)
  1b: 442667 (SE +/- 202.76, N = 3)
  1d: 433300 (SE +/- 321.46, N = 3)
  1e: 443000 (SE +/- 1365.04, N = 3)
  NVIDIA Quadro RTX 4000: 424633 (SE +/- 463.08, N = 3)
  NVIDIA RTX 4000: 446400 (SE +/- 200.00, N = 3)
  RTX 4000: 417767 (SE +/- 233.33, N = 3)
Hashcat 6.1.1 - Benchmark: MD5 (H/s, More Is Better)
  1: 25121966667 (SE +/- 25031801.99, N = 3)
  1a: 24944700000 (SE +/- 2051828.45, N = 3)
  1b: 24838966667 (SE +/- 12651789.51, N = 3)
  1d: 24259033333 (SE +/- 2643440.52, N = 3)
  1e: 24876900000 (SE +/- 12698162.60, N = 3)
  NVIDIA Quadro RTX 4000: 23840966667 (SE +/- 13574649.58, N = 3)
  NVIDIA RTX 4000: 25075466667 (SE +/- 24626025.08, N = 3)
  RTX 4000: 23506866667 (SE +/- 2355372.11, N = 3)
RealSR-NCNN 20200818 - Scale: 4x - TAA: No (Seconds, Fewer Is Better)
  1: 12.55 (SE +/- 0.02, N = 3)
  1a: 12.60 (SE +/- 0.02, N = 3)
  1b: 12.67 (SE +/- 0.04, N = 3)
  1c: 12.70 (SE +/- 0.03, N = 3)
  1d: 12.98 (SE +/- 0.00, N = 3)
  1e: 12.58 (SE +/- 0.01, N = 3)
  NVIDIA Quadro RTX 4000: 13.08 (SE +/- 0.02, N = 3)
  NVIDIA RTX 4000: 12.49 (SE +/- 0.01, N = 3)
  RTX 4000: 13.35 (SE +/- 0.01, N = 3)
Waifu2x-NCNN Vulkan 20200818 - Scale: 2x - Denoise: 3 - TAA: Yes (Seconds, Fewer Is Better)
  1: 5.502 (SE +/- 0.008, N = 3)
  1a: 5.551 (SE +/- 0.023, N = 3)
  1b: 5.572 (SE +/- 0.022, N = 3)
  1c: 5.671 (SE +/- 0.047, N = 3)
  1d: 5.694 (SE +/- 0.008, N = 3)
  1e: 5.571 (SE +/- 0.017, N = 3)
  NVIDIA Quadro RTX 4000: 5.785 (SE +/- 0.018, N = 3)
  NVIDIA RTX 4000: 5.540 (SE +/- 0.041, N = 3)
  RTX 4000: 5.861 (SE +/- 0.014, N = 3)
Hashcat 6.1.1 - Benchmark: SHA1 (H/s, More Is Better)
  1: 8704500000 (SE +/- 9832090.32, N = 3)
  1a: 8642000000 (SE +/- 6847870.72, N = 3)
  1b: 8615933333 (SE +/- 3773739.67, N = 3)
  1d: 8426866667 (SE +/- 7846938.54, N = 3)
  1e: 8633500000 (SE +/- 6005275.46, N = 3)
  NVIDIA Quadro RTX 4000: 8254033333 (SE +/- 5691026.07, N = 3)
  NVIDIA RTX 4000: 8700533333 (SE +/- 2630800.47, N = 3)
  RTX 4000: 8181566667 (SE +/- 6590228.46, N = 3)
Hashcat 6.1.1 - Benchmark: TrueCrypt RIPEMD160 + XTS (H/s, More Is Better)
  1: 327233
  1a: 326300
  1b: 324033
  1d: 317867
  1e: 324267
  NVIDIA Quadro RTX 4000: 312233
  NVIDIA RTX 4000: 327567
  RTX 4000: 309400
Standard errors as exported (six values for eight runs, N = 3 each): 533.33, 185.59, 88.19, 233.33, 317.98, 683.94
Hashcat 6.1.1 - Benchmark: SHA-512 (H/s, More Is Better)
  1: 1102266667 (SE +/- 819213.72, N = 3)
  1a: 1095266667 (SE +/- 491030.66, N = 3)
  1b: 1091033333 (SE +/- 643773.60, N = 3)
  1d: 1070066667 (SE +/- 1017076.42, N = 3)
  1e: 1092200000 (SE +/- 1021436.90, N = 3)
  NVIDIA Quadro RTX 4000: 1049733333 (SE +/- 1260070.54, N = 3)
  NVIDIA RTX 4000: 1100433333 (SE +/- 240370.09, N = 3)
  RTX 4000: 1041500000 (SE +/- 953939.20, N = 3)
clpeak - OpenCL Test: Integer Compute INT (GIOPS, More Is Better)
  1e: 5712.25 (SE +/- 68.26, N = 12)
  NVIDIA Quadro RTX 4000: 5741.92 (SE +/- 46.13, N = 3)
  RTX 4000: 6013.59 (SE +/- 102.19, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL
NCNN 20200916 - Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better)
  1e: 3.89 (SE +/- 0.01, N = 3; MIN: 3.86 / MAX: 3.99)
  NVIDIA Quadro RTX 4000: 3.93 (SE +/- 0.01, N = 3; MIN: 3.91 / MAX: 4.04)
  RTX 4000: 4.08 (SE +/- 0.14, N = 3; MIN: 3.92 / MAX: 40.55)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
VkFFT 2020-09-29 (Benchmark Score, More Is Better)
  1: 25694
  1a: 25585
  1b: 25486
  1d: 25027
  1e: 25457
  NVIDIA Quadro RTX 4000: 24684
  NVIDIA RTX 4000: 25593
  RTX 4000: 24538
Standard errors as exported (seven values for eight runs, N = 3 each): 28.39, 16.51, 27.82, 17.21, 32.54, 20.11, 4.04
NCNN 20200916 - Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better)
  1e: 2.18 (SE +/- 0.02, N = 3; MIN: 1.91 / MAX: 11.43)
  NVIDIA Quadro RTX 4000: 2.21 (SE +/- 0.01, N = 3; MIN: 1.91 / MAX: 6.96)
  RTX 4000: 2.27 (SE +/- 0.03, N = 3; MIN: 2.15 / MAX: 23.82)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better)
  1e: 8.79 (SE +/- 0.04, N = 3; MIN: 8.1 / MAX: 20.83)
  NVIDIA Quadro RTX 4000: 9.14 (SE +/- 0.13, N = 3; MIN: 8.49 / MAX: 36.48)
  RTX 4000: 9.02 (SE +/- 0.04, N = 3; MIN: 8.35 / MAX: 20.34)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
RedShift Demo 3.0 (Seconds, Fewer Is Better)
  1d: 382 (SE +/- 2.60, N = 3)
  1e: 381 (SE +/- 2.31, N = 3)
  NVIDIA Quadro RTX 4000: 391 (SE +/- 4.63, N = 3)
  NVIDIA RTX 4000: 379 (SE +/- 2.33, N = 3)
  RTX 4000: 393 (SE +/- 4.91, N = 3)
Blender 2.90 - Blend File: Classroom - Compute: CUDA (Seconds, Fewer Is Better)
  1e: 218.66 (SE +/- 1.49, N = 3)
  NVIDIA Quadro RTX 4000: 224.85 (SE +/- 3.62, N = 3)
  RTX 4000: 223.81 (SE +/- 3.26, N = 3)
NCNN 20200916 - Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better)
  1e: 3.32 (SE +/- 0.00, N = 3; MIN: 3.29 / MAX: 3.43)
  NVIDIA Quadro RTX 4000: 3.40 (SE +/- 0.02, N = 3; MIN: 3.33 / MAX: 20.26)
  RTX 4000: 3.37 (SE +/- 0.00, N = 3; MIN: 3.35 / MAX: 3.44)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  1e: 1.33 (SE +/- 0.00, N = 3; MIN: 1.32 / MAX: 1.4)
  NVIDIA Quadro RTX 4000: 1.35 (SE +/- 0.00, N = 3; MIN: 1.33 / MAX: 1.4)
  RTX 4000: 1.36 (SE +/- 0.00, N = 3; MIN: 1.34 / MAX: 1.41)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Blender 2.90 - Blend File: Classroom - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  1e: 115.78 (SE +/- 0.87, N = 3)
  NVIDIA Quadro RTX 4000: 118.39 (SE +/- 0.55, N = 3)
  RTX 4000: 117.20 (SE +/- 0.57, N = 3)
cl-mem 2017-01-13 - Benchmark: Write (GB/s, More Is Better)
  1: 325.5 (SE +/- 1.79, N = 3)
  1a: 320.4 (SE +/- 1.48, N = 3)
  1b: 321.1 (SE +/- 0.78, N = 3)
  1d: 321.1 (SE +/- 0.96, N = 3)
  1e: 322.5 (SE +/- 0.58, N = 3)
  NVIDIA Quadro RTX 4000: 318.4 (SE +/- 2.17, N = 3)
  NVIDIA RTX 4000: 323.0 (SE +/- 1.47, N = 3)
  RTX 4000: 319.4 (SE +/- 1.44, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL
Blender 2.90 - Blend File: Barbershop - Compute: CUDA (Seconds, Fewer Is Better)
  1e: 756.64 (SE +/- 2.80, N = 3)
  NVIDIA Quadro RTX 4000: 771.19 (SE +/- 1.15, N = 3)
  RTX 4000: 764.30 (SE +/- 0.87, N = 3)
NCNN 20200916 - Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  1e: 1.71 (SE +/- 0.00, N = 3; MIN: 1.7 / MAX: 1.75)
  NVIDIA Quadro RTX 4000: 1.74 (SE +/- 0.00, N = 3; MIN: 1.73 / MAX: 1.81)
  RTX 4000: 1.74 (SE +/- 0.00, N = 3; MIN: 1.73 / MAX: 1.8)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better)
  1e: 0.63 (SE +/- 0.00, N = 3; MIN: 0.62 / MAX: 0.68)
  NVIDIA Quadro RTX 4000: 0.63 (SE +/- 0.01, N = 3; MIN: 0.62 / MAX: 0.65)
  RTX 4000: 0.64 (SE +/- 0.00, N = 3; MIN: 0.62 / MAX: 0.84)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
cl-mem 2017-01-13 - Benchmark: Copy (GB/s, More Is Better)
  1: 283.0 (SE +/- 0.26, N = 3)
  1a: 282.6 (SE +/- 0.06, N = 3)
  1b: 282.1 (SE +/- 0.12, N = 3)
  1d: 281.1 (SE +/- 0.20, N = 3)
  1e: 282.1 (SE +/- 0.12, N = 3)
  NVIDIA Quadro RTX 4000: 278.7 (SE +/- 0.20, N = 3)
  NVIDIA RTX 4000: 282.1 (SE +/- 0.28, N = 3)
  RTX 4000: 278.8 (SE +/- 0.12, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL
clpeak - OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better)
  1e: 346.09 (SE +/- 4.72, N = 3)
  NVIDIA Quadro RTX 4000: 340.91 (SE +/- 4.44, N = 3)
  RTX 4000: 342.37 (SE +/- 5.13, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL
NCNN 20200916 - Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better)
  1e: 1.52 (SE +/- 0.00, N = 3; MIN: 1.5 / MAX: 1.56)
  NVIDIA Quadro RTX 4000: 1.54 (SE +/- 0.00, N = 3; MIN: 1.53 / MAX: 1.63)
  RTX 4000: 1.54 (SE +/- 0.00, N = 3; MIN: 1.53 / MAX: 1.63)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Blender 2.90 - Blend File: Fishy Cat - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  1e: 58.38 (SE +/- 0.21, N = 3)
  NVIDIA Quadro RTX 4000: 59.12 (SE +/- 0.24, N = 3)
  RTX 4000: 58.69 (SE +/- 0.22, N = 3)
PlaidML - FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL (FPS, More Is Better)
  1e: 140.54 (SE +/- 0.18, N = 3)
  NVIDIA Quadro RTX 4000: 138.91 (SE +/- 0.11, N = 3)
  RTX 4000: 139.15 (SE +/- 0.12, N = 3)
Blender 2.90 - Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  1e: 160.82 (SE +/- 0.15, N = 3)
  NVIDIA Quadro RTX 4000: 160.95 (SE +/- 0.11, N = 3)
  RTX 4000: 159.09 (SE +/- 0.48, N = 3)
Blender 2.90 - Blend File: Barbershop - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  1e: 1307.24 (SE +/- 4.68, N = 3)
  NVIDIA Quadro RTX 4000: 1321.89 (SE +/- 1.05, N = 3)
  RTX 4000: 1307.37 (SE +/- 0.53, N = 3)
Blender 2.90 - Blend File: Fishy Cat - Compute: CUDA (Seconds, Fewer Is Better)
  1e: 112.74 (SE +/- 0.28, N = 3)
  NVIDIA Quadro RTX 4000: 114.00 (SE +/- 0.16, N = 3)
  RTX 4000: 113.88 (SE +/- 0.21, N = 3)
NCNN 20200916 - Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  1e: 2.73 (SE +/- 0.01, N = 3; MIN: 2.7 / MAX: 8.24)
  NVIDIA Quadro RTX 4000: 2.75 (SE +/- 0.00, N = 3; MIN: 2.74 / MAX: 3.38)
  RTX 4000: 2.76 (SE +/- 0.00, N = 3; MIN: 2.75 / MAX: 3.23)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Blender 2.90 - Blend File: BMW27 - Compute: CUDA (Seconds, Fewer Is Better)
  1e: 57.80 (SE +/- 0.09, N = 3)
  NVIDIA Quadro RTX 4000: 58.10 (SE +/- 0.14, N = 3)
  RTX 4000: 58.41 (SE +/- 0.10, N = 3)
PlaidML - FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL (FPS, More Is Better)
  1e: 1490.38 (SE +/- 5.16, N = 3)
  NVIDIA Quadro RTX 4000: 1475.98 (SE +/- 5.69, N = 3)
  RTX 4000: 1480.20 (SE +/- 5.32, N = 3)
PlaidML - FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL (FPS, More Is Better)
  1e: 1843.84 (SE +/- 9.92, N = 3)
  NVIDIA Quadro RTX 4000: 1834.20 (SE +/- 2.10, N = 3)
  RTX 4000: 1829.24 (SE +/- 8.76, N = 3)
NCNN 20200916 - Target: Vulkan GPU - Model: squeezenet (ms, Fewer Is Better)
  1e: 3.77 (SE +/- 0.01, N = 3; MIN: 3.71 / MAX: 3.87)
  NVIDIA Quadro RTX 4000: 3.78 (SE +/- 0.01, N = 3; MIN: 3.72 / MAX: 3.84)
  RTX 4000: 3.80 (SE +/- 0.01, N = 3; MIN: 3.74 / MAX: 10.31)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
ViennaCL 1.4.2 - OpenCL LU Factorization (GFLOPS, More Is Better)
  1: 68.29 (SE +/- 0.12, N = 3)
  1a: 68.47 (SE +/- 0.12, N = 3)
  1b: 68.25 (SE +/- 0.28, N = 3)
  1d: 68.38 (SE +/- 0.06, N = 3)
  1e: 68.41 (SE +/- 0.02, N = 3)
  NVIDIA Quadro RTX 4000: 68.02 (SE +/- 0.02, N = 3)
  NVIDIA RTX 4000: 68.29 (SE +/- 0.02, N = 3)
  RTX 4000: 68.02 (SE +/- 0.06, N = 3)
(CXX) g++ options: -rdynamic -lOpenCL
PlaidML - FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL (FPS, More Is Better)
  1e: 423.50 (SE +/- 0.45, N = 3)
  NVIDIA Quadro RTX 4000: 421.03 (SE +/- 1.26, N = 3)
  RTX 4000: 420.80 (SE +/- 0.32, N = 3)
FAHBench 2.3.2 (Ns Per Day, More Is Better)
  1d: 191.63 (SE +/- 0.40, N = 3)
  1e: 191.84 (SE +/- 0.32, N = 3)
  NVIDIA Quadro RTX 4000: 190.76 (SE +/- 0.27, N = 3)
  RTX 4000: 190.82 (SE +/- 0.37, N = 3)
MandelGPU 1.3pts1 - OpenCL Device: GPU (Samples/sec, More Is Better)
  1e: 248122412.9 (SE +/- 711502.39, N = 3)
  NVIDIA Quadro RTX 4000: 248177151.4 (SE +/- 308768.05, N = 3)
  RTX 4000: 246857018.0 (SE +/- 540319.59, N = 3)
(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
ArrayFire 3.7 - Test: Conjugate Gradient OpenCL (ms, Fewer Is Better)
  1e: 2.247 (SE +/- 0.008, N = 3)
  NVIDIA Quadro RTX 4000: 2.255 (SE +/- 0.012, N = 3)
  RTX 4000: 2.257 (SE +/- 0.007, N = 3)
(CXX) g++ options: -rdynamic
Blender 2.90 - Blend File: Pabellon Barcelona - Compute: CUDA (Seconds, Fewer Is Better)
  1e: 459.35 (SE +/- 1.17, N = 3)
  NVIDIA Quadro RTX 4000: 460.64 (SE +/- 1.47, N = 3)
  RTX 4000: 459.08 (SE +/- 0.47, N = 3)
cl-mem 2017-01-13 - Benchmark: Read (GB/s, More Is Better)
  1: 379.7 (SE +/- 0.03, N = 3)
  1a: 379.2 (SE +/- 0.06, N = 3)
  1b: 379.2 (SE +/- 0.03, N = 3)
  1d: 379.3 (SE +/- 0.00, N = 3)
  1e: 379.3 (SE +/- 0.03, N = 3)
  NVIDIA Quadro RTX 4000: 379.3 (SE +/- 0.00, N = 3)
  NVIDIA RTX 4000: 379.3 (SE +/- 0.03, N = 3)
  RTX 4000: 379.3 (SE +/- 0.00, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL
clpeak - OpenCL Test: Double-Precision Double (GFLOPS, More Is Better)
  1e: 259.66 (SE +/- 0.31, N = 3)
  NVIDIA Quadro RTX 4000: 259.50 (SE +/- 0.18, N = 3)
  RTX 4000: 259.33 (SE +/- 0.02, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL
FinanceBench 2016-06-06 - Benchmark: Black-Scholes OpenCL (ms, Fewer Is Better)
  1: 14.03 (SE +/- 0.00, N = 3)
  1a: 14.03 (SE +/- 0.00, N = 3)
  1b: 14.04 (SE +/- 0.00, N = 3)
  1d: 14.04 (SE +/- 0.00, N = 3)
  1e: 14.03 (SE +/- 0.00, N = 3)
  NVIDIA Quadro RTX 4000: 14.03 (SE +/- 0.00, N = 3)
  NVIDIA RTX 4000: 14.03 (SE +/- 0.00, N = 3)
  RTX 4000: 14.03 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -lOpenCL
NCNN 20200916 - Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  1e: 1.48 (SE +/- 0.03, N = 3; MIN: 1.44 / MAX: 20.23)
  NVIDIA Quadro RTX 4000: 1.48 (SE +/- 0.00, N = 3; MIN: 1.46 / MAX: 1.5)
  RTX 4000: 1.48 (SE +/- 0.00, N = 3; MIN: 1.47 / MAX: 1.54)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NeatBench 5 - Acceleration: GPU (FPS, More Is Better)
  1e: 31.0 (SE +/- 0.66, N = 15)
  NVIDIA Quadro RTX 4000: 30.9 (SE +/- 0.69, N = 15)
  RTX 4000: 30.2 (SE +/- 0.63, N = 15)
Blender 2.90 - Blend File: BMW27 - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  1e: 32.14 (SE +/- 3.24, N = 15)
  NVIDIA Quadro RTX 4000: 29.23 (SE +/- 0.09, N = 3)
  RTX 4000: 29.14 (SE +/- 0.06, N = 3)
NCNN 20200916 - Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better)
  1e: 8.62 (SE +/- 0.37, N = 3; MIN: 8.1 / MAX: 74.77)
  NVIDIA Quadro RTX 4000: 8.33 (SE +/- 0.09, N = 3; MIN: 8.13 / MAX: 55.28)
  RTX 4000: 8.22 (SE +/- 0.00, N = 3; MIN: 8.15 / MAX: 8.57)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better)
  1e: 1.80 (SE +/- 0.05, N = 2; MIN: 1.69 / MAX: 21.82)
  NVIDIA Quadro RTX 4000: 1.95 (SE +/- 0.13, N = 3; MIN: 1.7 / MAX: 20.49)
  RTX 4000: 1.77 (SE +/- 0.04, N = 3; MIN: 1.71 / MAX: 24.18)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better)
  1e: 4.66 (SE +/- 0.01, N = 3; MIN: 4.6 / MAX: 4.86)
  NVIDIA Quadro RTX 4000: 4.67 (SE +/- 0.01, N = 3; MIN: 4.64 / MAX: 4.75)
  RTX 4000: 4.86 (SE +/- 0.17, N = 3; MIN: 4.64 / MAX: 71.23)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
LuxCoreRender OpenCL 2.3 - Scene: Rainbow Colors and Prism (M samples/sec, More Is Better)
  1d: 10.42 (SE +/- 0.34, N = 12; MIN: 3.45 / MAX: 11.19)
  1e: 10.78 (SE +/- 0.02, N = 3; MIN: 10.09 / MAX: 11.23)
  NVIDIA Quadro RTX 4000: 10.79 (SE +/- 0.06, N = 3; MIN: 10.45 / MAX: 11.21)
  RTX 4000: 10.73 (SE +/- 0.02, N = 3; MIN: 9.75 / MAX: 11.24)
LuxCoreRender OpenCL 2.3 - Scene: LuxCore Benchmark (M samples/sec, More Is Better)
  1d: 3.40 (SE +/- 0.07, N = 12; MIN: 0.17 / MAX: 3.97)
  1e: 3.50 (SE +/- 0.02, N = 3; MIN: 0.27 / MAX: 4)
  NVIDIA Quadro RTX 4000: 3.47 (SE +/- 0.01, N = 3; MIN: 0.27 / MAX: 3.96)
  RTX 4000: 3.47 (SE +/- 0.02, N = 3; MIN: 0.33 / MAX: 3.96)
LuxCoreRender OpenCL 2.3 - Scene: Food (M samples/sec, More Is Better)
  1d: 1.52 (SE +/- 0.04, N = 12; MIN: 0.14 / MAX: 1.88)
  1e: 1.57 (SE +/- 0.01, N = 3; MIN: 0.26 / MAX: 1.89)
  NVIDIA Quadro RTX 4000: 1.55 (SE +/- 0.01, N = 3; MIN: 0.25 / MAX: 1.85)
  RTX 4000: 1.55 (SE +/- 0.01, N = 3; MIN: 0.26 / MAX: 1.86)
LuxCoreRender OpenCL 2.3 - Scene: DLSC (M samples/sec, More Is Better)
  1d: 3.99 (SE +/- 0.08, N = 12; MIN: 1.12 / MAX: 4.22)
  1e: 4.09 (SE +/- 0.01, N = 3; MIN: 3.82 / MAX: 4.25)
  NVIDIA Quadro RTX 4000: 4.01 (SE +/- 0.02, N = 3; MIN: 3.83 / MAX: 4.21)
  RTX 4000: 4.02 (SE +/- 0.01, N = 3; MIN: 3.82 / MAX: 4.2)
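Because the tests mix "Fewer Is Better" and "More Is Better" metrics, one way to compare runs across tests is to express each result as a percentage of the best run for that test. The sketch below is illustrative only: `relative_to_best` is a hypothetical helper, not a Phoronix Test Suite function, and the input values are the RealSR-NCNN (Scale: 4x, TAA: Yes) averages from this result file.

```python
def relative_to_best(results, higher_is_better):
    """Map each run to its score as a percentage of the best run (best = 100.0)."""
    values = results.values()
    best = max(values) if higher_is_better else min(values)
    relative = {}
    for run, value in results.items():
        # For time-like metrics a smaller value is better, so invert the ratio.
        ratio = value / best if higher_is_better else best / value
        relative[run] = round(100.0 * ratio, 1)
    return relative


# RealSR-NCNN, Scale: 4x - TAA: Yes (Seconds, Fewer Is Better)
realsr_seconds = {
    "1": 80.45,
    "1a": 81.13,
    "1b": 81.54,
    "1c": 82.46,
    "1d": 84.17,
    "1e": 81.17,
    "NVIDIA Quadro RTX 4000": 85.93,
    "NVIDIA RTX 4000": 80.55,
    "RTX 4000": 87.79,
}

rel = relative_to_best(realsr_seconds, higher_is_better=False)
for run, pct in rel.items():
    print(f"{run}: {pct}%")
```

Passing `higher_is_better=True` handles throughput metrics such as H/s or GFLOPS in the same way.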
Phoronix Test Suite v10.8.4