NGC
AMD Ryzen 9 3950X 16-Core testing with an ASUS ROG CROSSHAIR VIII HERO (WI-FI) (1302 BIOS) and Zotac NVIDIA GeForce GTX 1070 Ti 8GB on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2009179-FI-NGC40795300&grs .
System details
Runs: GTX 980, GTX 970, GTX 980 Ti, GTX 1070 Ti

Processor: AMD Ryzen 9 3950X 16-Core @ 3.50GHz (16 Cores / 32 Threads)
Motherboard: ASUS ROG CROSSHAIR VIII HERO (WI-FI) (1302 BIOS)
Chipset: AMD Starship/Matisse
Memory: 16GB
Disk: 2000GB Corsair Force MP600 + 2000GB
Monitor: DELL P2415Q
Network: Realtek RTL8125 2.5GbE + Intel I211 + Intel Wi-Fi 6 AX200
OS: Ubuntu 20.04
Kernel: 5.4.0-47-generic (x86_64)
Desktop: GNOME Shell 3.36.4
Display Server: X Server 1.20.8
Display Driver: NVIDIA 450.66
OpenGL: 4.6.0
OpenCL: OpenCL 1.2 CUDA 11.0.228 + OpenCL 2.0 AMD-APP (3182.0)
Vulkan: 1.2.133
Compiler: GCC 9.3.0 + CUDA 11.0
File-System: ext4
Screen Resolution: 3840x2160

Graphics:
  GTX 980:     NVIDIA GeForce GTX 980 4GB (1126/3505MHz)
  GTX 970:     eVGA NVIDIA GeForce GTX 970 4GB (1163/3505MHz)
  GTX 980 Ti:  NVIDIA GeForce GTX 980 Ti 6GB (999/3505MHz)
  GTX 1070 Ti: Zotac NVIDIA GeForce GTX 1070 Ti 8GB (139/4006MHz)

Audio (as listed):
  GTX 980:     NVIDIA GM204 HD Audio
  GTX 980 Ti:  NVIDIA GM200 HD Audio
  GTX 1070 Ti: NVIDIA GP104 HD Audio

Compiler Details
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
  GTX 980:     Scaling Governor: acpi-cpufreq performance - CPU Microcode: 0x8701013
  GTX 970:     Scaling Governor: acpi-cpufreq ondemand - CPU Microcode: 0x8701013
  GTX 980 Ti:  Scaling Governor: acpi-cpufreq performance - CPU Microcode: 0x8701013
  GTX 1070 Ti: Scaling Governor: acpi-cpufreq ondemand - CPU Microcode: 0x8701013

OpenCL Details
  GTX 980:     GPU Compute Cores: 2048
  GTX 970:     GPU Compute Cores: 1664
  GTX 980 Ti:  GPU Compute Cores: 2816
  GTX 1070 Ti: GPU Compute Cores: 2432

Python Details
  Python 3.8.2

Security Details
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling
  srbds: Not affected
  tsx_async_abort: Not affected
Results overview

Test                                             GTX 980      GTX 970      GTX 980 Ti   GTX 1070 Ti
financebench: Black-Scholes OpenCL               12.019       14.198       10.061       79.121999
octanebench: Total Score                         110.594254   96.072862    143.092739   24.871588
blender: Barbershop - CUDA                       1415.17      1540.95      1193.54      3430.14
plaidml: No - Inference - IMDB LSTM - OpenCL     228.97       202.94       266.29       387.56
clpeak: Global Memory Bandwidth                  165.23       144.51       264.34       -
cl-mem: Write                                    157          134.0        244.2        194.4
plaidml: Yes - Inference - Mobilenet - OpenCL    1053.40      909.50       1281.97      1607.15
blender: Pabellon Barcelona - NVIDIA OptiX       858.55       1031.90      690.29       -
arrayfire: Conjugate Gradient OpenCL             4.883        5.403        3.623        5.115
plaidml: No - Training - Mobilenet - OpenCL      80.43        73.01        89.44        104.88
clpeak: Single-Precision Float                   4448.81      3877.42      5556.00      -
blender: Fishy Cat - NVIDIA OptiX                319.37       369.54       258.49       -
glmark2: 3840 x 2160                             -            2057         2933         -
clpeak: Double-Precision Double                  159.64       137.81       196.28       -
clpeak: Integer Compute INT                      1303.29      1145.39      1615.04      -
glmark2: 2560 x 1440                             -            3805         5293         -
mandelgpu: GPU                                   126002111.2  112885853.7  152063380.5  -
glmark2: 1920 x 1200                             -            5185         6916         -
glmark2: 1920 x 1080                             -            5601         7363         -
glmark2: 1600 x 1200                             -            5688         7472         -
blender: Pabellon Barcelona - CUDA               918.07       1012.28      780.37       -
glmark2: 1280 x 1024                             -            7007         8967         -
fahbench:                                        101.8257     90.9439      115.4876     -
glmark2: 1024 x 768                              -            9148         11254        -
glmark2: 800 x 600                               -            11325        12886        -
neatbench: GPU                                   17.2         15.4         -            -
blender: Classroom - NVIDIA OptiX                423.50       481.80       342.76       2435.50
blender: BMW27 - NVIDIA OptiX                    151.49       175.44       123.07       460.10
blender: Fishy Cat - CUDA                        328.65       351.06       292.20       931.55
blender: Classroom - CUDA                        421.77       444.58       354.53       880.20
blender: BMW27 - CUDA                            165.20       173.01       149.21       200.04
plaidml: No - Inference - DenseNet 201 - OpenCL  82.06        73.69        99.51        88.37
plaidml: No - Inference - Mobilenet - OpenCL     810.31       717.50       1030.27      875.35
namd-cuda: ATPase Simulation - 327,506 Atoms     0.32163      0.35160      0.28182      0.33292
rodinia: OpenCL Particle Filter                  11.729       13.049       10.095       11.525
lczero: OpenCL                                   6275         4862         8291         4631
redshift:                                        1002         1119         764          2530
cl-mem: Read                                     165.4        144.6        266.0        178.6
cl-mem: Copy                                     145.0        125.8        217.5        107.8
viennacl: OpenCL LU Factorization                63.1567      58.6527      64.9328      40.1134
gromacs-gpu: Water Benchmark                     3.288        3.001        3.881        0.958
FinanceBench 2016-06-06 - Benchmark: Black-Scholes OpenCL (ms, Fewer Is Better)
  GTX 980:     12.02 (SE +/- 0.00, N = 3)
  GTX 970:     14.20 (SE +/- 0.00, N = 3)
  GTX 980 Ti:  10.06 (SE +/- 0.01, N = 3)
  GTX 1070 Ti: 79.12 (SE +/- 0.44, N = 3)
  1. (CXX) g++ options: -O3 -lOpenCL
OctaneBench 4.00c - Total Score (Score, More Is Better)
  GTX 980:     110.59
  GTX 970:     96.07
  GTX 980 Ti:  143.09
  GTX 1070 Ti: 24.87
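The two results above point in opposite directions: FinanceBench reports milliseconds (fewer is better) while OctaneBench reports a score (more is better), so comparing cards across tests needs a common scale. The following is a minimal Python sketch of one way to do that, assuming the GTX 970 as an arbitrary baseline. The names `relative_performance` and `normalize` are illustrative only and are not part of the Phoronix Test Suite.

```python
def relative_performance(value: float, baseline: float, lower_is_better: bool) -> float:
    """Return the speed of `value` relative to `baseline` (> 1.0 means faster)."""
    if value <= 0 or baseline <= 0:
        raise ValueError("results must be positive")
    # For time-like metrics a smaller number is faster, so the ratio is inverted.
    return baseline / value if lower_is_better else value / baseline


def normalize(results: dict, baseline_name: str, lower_is_better: bool) -> dict:
    """Express every result as a multiple of the baseline card, rounded to 2 places."""
    base = results[baseline_name]
    return {
        name: round(relative_performance(value, base, lower_is_better), 2)
        for name, value in results.items()
    }


# Figures copied from the FinanceBench and OctaneBench results above.
financebench_ms = {"GTX 980": 12.02, "GTX 970": 14.20, "GTX 980 Ti": 10.06, "GTX 1070 Ti": 79.12}
octanebench_score = {"GTX 980": 110.59, "GTX 970": 96.07, "GTX 980 Ti": 143.09, "GTX 1070 Ti": 24.87}

if __name__ == "__main__":
    print(normalize(financebench_ms, "GTX 970", lower_is_better=True))
    print(normalize(octanebench_score, "GTX 970", lower_is_better=False))
```

With this convention a value above 1.0 always means "faster than the baseline", regardless of the unit the test reports.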
Blender 2.90 - Blend File: Barbershop - Compute: CUDA (Seconds, Fewer Is Better)
  GTX 980:     1415.17
  GTX 970:     1540.95
  GTX 980 Ti:  1193.54
  GTX 1070 Ti: 3430.14
  SE as reported (three values for four results, so not attributed per card): +/- 0.23, +/- 0.20, +/- 0.91; N = 3 each
PlaidML - FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL (FPS, More Is Better)
  GTX 980:     228.97 (SE +/- 0.16, N = 3)
  GTX 970:     202.94 (SE +/- 0.01, N = 3)
  GTX 980 Ti:  266.29 (SE +/- 0.11, N = 3)
  GTX 1070 Ti: 387.56 (SE +/- 0.12, N = 3)
clpeak - OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better)
  GTX 980:     165.23 (SE +/- 0.10, N = 3)
  GTX 970:     144.51 (SE +/- 0.04, N = 3)
  GTX 980 Ti:  264.34 (SE +/- 0.41, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
cl-mem 2017-01-13 - Benchmark: Write (GB/s, More Is Better)
  GTX 980:     157.0
  GTX 970:     134.0
  GTX 980 Ti:  244.2
  GTX 1070 Ti: 194.4
  SE as reported (three values for four results, so not attributed per card): +/- 0.03, +/- 0.06, +/- 0.92; N = 3 each
  1. (CC) gcc options: -O2 -flto -lOpenCL
PlaidML - FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL (FPS, More Is Better)
  GTX 980:     1053.40 (SE +/- 1.60, N = 3)
  GTX 970:     909.50 (SE +/- 1.17, N = 3)
  GTX 980 Ti:  1281.97 (SE +/- 0.94, N = 3)
  GTX 1070 Ti: 1607.15 (SE +/- 2.45, N = 3)
Blender 2.90 - Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  GTX 980:     858.55 (SE +/- 0.33, N = 3)
  GTX 970:     1031.90 (SE +/- 0.90, N = 3)
  GTX 980 Ti:  690.29 (SE +/- 0.36, N = 3)
ArrayFire 3.7 - Test: Conjugate Gradient OpenCL (ms, Fewer Is Better)
  GTX 980:     4.883 (SE +/- 0.003, N = 3)
  GTX 970:     5.403 (SE +/- 0.004, N = 3)
  GTX 980 Ti:  3.623 (SE +/- 0.004, N = 3)
  GTX 1070 Ti: 5.115 (SE +/- 0.062, N = 3)
  1. (CXX) g++ options: -rdynamic
PlaidML - FP16: No - Mode: Training - Network: Mobilenet - Device: OpenCL (FPS, More Is Better)
  GTX 980:     80.43 (SE +/- 0.02, N = 3)
  GTX 970:     73.01 (SE +/- 0.03, N = 3)
  GTX 980 Ti:  89.44 (SE +/- 0.12, N = 3)
  GTX 1070 Ti: 104.88 (SE +/- 0.03, N = 3)
clpeak - OpenCL Test: Single-Precision Float (GFLOPS, More Is Better)
  GTX 980:     4448.81 (SE +/- 67.71, N = 3)
  GTX 970:     3877.42 (SE +/- 43.66, N = 15)
  GTX 980 Ti:  5556.00 (SE +/- 20.25, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
Blender 2.90 - Blend File: Fishy Cat - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  GTX 980:     319.37 (SE +/- 0.12, N = 3)
  GTX 970:     369.54 (SE +/- 0.03, N = 3)
  GTX 980 Ti:  258.49 (SE +/- 0.09, N = 3)
GLmark2 2020.04 - Resolution: 3840 x 2160 (Score, More Is Better)
  GTX 970:     2057
  GTX 980 Ti:  2933
clpeak - OpenCL Test: Double-Precision Double (GFLOPS, More Is Better)
  GTX 980:     159.64 (SE +/- 0.02, N = 3)
  GTX 970:     137.81 (SE +/- 0.01, N = 3)
  GTX 980 Ti:  196.28 (SE +/- 0.03, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
clpeak - OpenCL Test: Integer Compute INT (GIOPS, More Is Better)
  GTX 980:     1303.29 (SE +/- 3.33, N = 3)
  GTX 970:     1145.39 (SE +/- 14.81, N = 5)
  GTX 980 Ti:  1615.04 (SE +/- 11.49, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
GLmark2 2020.04 - Resolution: 2560 x 1440 (Score, More Is Better)
  GTX 970:     3805
  GTX 980 Ti:  5293
MandelGPU 1.3pts1 - OpenCL Device: GPU (Samples/sec, More Is Better)
  GTX 980:     126002111.2 (SE +/- 172505.04, N = 3)
  GTX 970:     112885853.7 (SE +/- 116092.47, N = 3)
  GTX 980 Ti:  152063380.5 (SE +/- 367945.59, N = 3)
  1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
GLmark2 2020.04 - Resolution: 1920 x 1200 (Score, More Is Better)
  GTX 970:     5185
  GTX 980 Ti:  6916
GLmark2 2020.04 - Resolution: 1920 x 1080 (Score, More Is Better)
  GTX 970:     5601
  GTX 980 Ti:  7363
GLmark2 2020.04 - Resolution: 1600 x 1200 (Score, More Is Better)
  GTX 970:     5688
  GTX 980 Ti:  7472
Blender 2.90 - Blend File: Pabellon Barcelona - Compute: CUDA (Seconds, Fewer Is Better)
  GTX 980:     918.07 (SE +/- 0.13, N = 3)
  GTX 970:     1012.28 (SE +/- 12.17, N = 3)
  GTX 980 Ti:  780.37 (SE +/- 0.09, N = 3)
GLmark2 2020.04 - Resolution: 1280 x 1024 (Score, More Is Better)
  GTX 970:     7007
  GTX 980 Ti:  8967
FAHBench 2.3.2 (Ns Per Day, More Is Better)
  GTX 980:     101.83 (SE +/- 0.17, N = 3)
  GTX 970:     90.94 (SE +/- 0.02, N = 3)
  GTX 980 Ti:  115.49 (SE +/- 0.05, N = 3)
GLmark2 2020.04 - Resolution: 1024 x 768 (Score, More Is Better)
  GTX 970:     9148
  GTX 980 Ti:  11254
GLmark2 2020.04 - Resolution: 800 x 600 (Score, More Is Better)
  GTX 970:     11325
  GTX 980 Ti:  12886
NeatBench 5 - Acceleration: GPU (FPS, More Is Better)
  GTX 980:     17.2 (SE +/- 0.12, N = 3)
  GTX 970:     15.4 (SE +/- 0.00, N = 3)
Blender 2.90 - Blend File: Classroom - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  GTX 980:     423.50 (SE +/- 0.15, N = 3)
  GTX 970:     481.80 (SE +/- 0.10, N = 3)
  GTX 980 Ti:  342.76 (SE +/- 0.05, N = 3)
  GTX 1070 Ti: 2435.50 (SE +/- 241.03, N = 6)
Blender 2.90 - Blend File: BMW27 - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  GTX 980:     151.49 (SE +/- 0.13, N = 3)
  GTX 970:     175.44 (SE +/- 0.16, N = 3)
  GTX 980 Ti:  123.07 (SE +/- 0.03, N = 3)
  GTX 1070 Ti: 460.10 (SE +/- 40.73, N = 9)
Blender 2.90 - Blend File: Fishy Cat - Compute: CUDA (Seconds, Fewer Is Better)
  GTX 980:     328.65 (SE +/- 0.13, N = 3)
  GTX 970:     351.06 (SE +/- 0.06, N = 3)
  GTX 980 Ti:  292.20 (SE +/- 0.21, N = 3)
  GTX 1070 Ti: 931.55 (SE +/- 107.89, N = 9)
Blender 2.90 - Blend File: Classroom - Compute: CUDA (Seconds, Fewer Is Better)
  GTX 980:     421.77 (SE +/- 0.35, N = 3)
  GTX 970:     444.58 (SE +/- 0.05, N = 3)
  GTX 980 Ti:  354.53 (SE +/- 0.14, N = 3)
  GTX 1070 Ti: 880.20 (SE +/- 92.48, N = 9)
Blender 2.90 - Blend File: BMW27 - Compute: CUDA (Seconds, Fewer Is Better)
  GTX 980:     165.20 (SE +/- 0.23, N = 3)
  GTX 970:     173.01 (SE +/- 0.07, N = 3)
  GTX 980 Ti:  149.21 (SE +/- 0.04, N = 3)
  GTX 1070 Ti: 200.04 (SE +/- 7.36, N = 9)
PlaidML - FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL (FPS, More Is Better)
  GTX 980:     82.06 (SE +/- 0.14, N = 3)
  GTX 970:     73.69 (SE +/- 0.01, N = 3)
  GTX 980 Ti:  99.51 (SE +/- 0.05, N = 3)
  GTX 1070 Ti: 88.37 (SE +/- 6.24, N = 12)
PlaidML - FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL (FPS, More Is Better)
  GTX 980:     810.31 (SE +/- 0.50, N = 3)
  GTX 970:     717.50 (SE +/- 0.32, N = 3)
  GTX 980 Ti:  1030.27 (SE +/- 1.62, N = 3)
  GTX 1070 Ti: 875.35 (SE +/- 59.39, N = 12)
NAMD CUDA 2.14 - ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better)
  GTX 980:     0.32163 (SE +/- 0.00070, N = 3)
  GTX 970:     0.35160 (SE +/- 0.00210, N = 3)
  GTX 980 Ti:  0.28182 (SE +/- 0.00249, N = 15)
  GTX 1070 Ti: 0.33292 (SE +/- 0.01554, N = 15)
Rodinia 3.1 - Test: OpenCL Particle Filter (Seconds, Fewer Is Better)
  GTX 980:     11.73 (SE +/- 0.14, N = 3)
  GTX 970:     13.05 (SE +/- 0.10, N = 3)
  GTX 980 Ti:  10.10 (SE +/- 0.15, N = 4)
  GTX 1070 Ti: 11.53 (SE +/- 1.42, N = 12)
  1. (CXX) g++ options: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl
LeelaChessZero 0.26 - Backend: OpenCL (Nodes Per Second, More Is Better)
  GTX 980:     6275 (SE +/- 37.68, N = 3)
  GTX 970:     4862 (SE +/- 41.16, N = 3)
  GTX 980 Ti:  8291 (SE +/- 4.91, N = 3)
  GTX 1070 Ti: 4631 (SE +/- 984.05, N = 6)
  1. (CXX) g++ options: -flto -pthread
RedShift Demo 3.0 (Seconds, Fewer Is Better)
  GTX 980:     1002 (SE +/- 1.76, N = 3)
  GTX 970:     1119 (SE +/- 2.19, N = 3)
  GTX 980 Ti:  764 (SE +/- 2.40, N = 3)
  GTX 1070 Ti: 2530 (SE +/- 371.60, N = 6)
cl-mem 2017-01-13 - Benchmark: Read (GB/s, More Is Better)
  GTX 980:     165.4 (SE +/- 0.06, N = 3)
  GTX 970:     144.6 (SE +/- 0.03, N = 3)
  GTX 980 Ti:  266.0 (SE +/- 0.15, N = 3)
  GTX 1070 Ti: 178.6 (SE +/- 12.40, N = 12)
  1. (CC) gcc options: -O2 -flto -lOpenCL
cl-mem 2017-01-13 - Benchmark: Copy (GB/s, More Is Better)
  GTX 980:     145.0 (SE +/- 0.07, N = 3)
  GTX 970:     125.8 (SE +/- 0.03, N = 3)
  GTX 980 Ti:  217.5 (SE +/- 0.23, N = 3)
  GTX 1070 Ti: 107.8 (SE +/- 13.47, N = 12)
  1. (CC) gcc options: -O2 -flto -lOpenCL
ViennaCL 1.4.2 - OpenCL LU Factorization (GFLOPS, More Is Better)
  GTX 980:     63.16 (SE +/- 0.21, N = 3)
  GTX 970:     58.65 (SE +/- 0.19, N = 3)
  GTX 980 Ti:  64.93 (SE +/- 0.45, N = 3)
  GTX 1070 Ti: 40.11 (SE +/- 5.70, N = 15)
  1. (CXX) g++ options: -rdynamic -lOpenCL
GROMACS 2020.3 - Water Benchmark (Ns Per Day, More Is Better)
  GTX 980:     3.288 (SE +/- 0.007, N = 3)
  GTX 970:     3.001 (SE +/- 0.000, N = 3)
  GTX 980 Ti:  3.881 (SE +/- 0.005, N = 3)
  GTX 1070 Ti: 0.958 (SE +/- 0.235, N = 9)
  1. (CXX) g++ options: -O3 -lpthread -ldl -lrt -lm
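The SE figures are easier to compare across tests when expressed as a percentage of the reported mean. The following is a minimal Python sketch using the GROMACS figures above. The 5% cutoff and the helper names `relative_standard_error` and `noisy_runs` are assumptions chosen for illustration; they are not values or functions reported by the Phoronix Test Suite.

```python
def relative_standard_error(mean: float, se: float) -> float:
    """Standard error expressed as a percentage of the mean."""
    if mean <= 0:
        raise ValueError("mean must be positive")
    return 100.0 * se / mean


# Hypothetical cutoff: runs whose SE exceeds 5% of the mean are treated as noisy.
NOISY_THRESHOLD_PCT = 5.0


def noisy_runs(results: dict, threshold_pct: float = NOISY_THRESHOLD_PCT) -> list:
    """Return the names of runs whose relative standard error exceeds the cutoff."""
    return [
        name
        for name, (mean, se) in results.items()
        if relative_standard_error(mean, se) > threshold_pct
    ]


# (mean, SE) pairs copied from the GROMACS Water Benchmark result above.
gromacs = {
    "GTX 980": (3.288, 0.007),
    "GTX 970": (3.001, 0.000),
    "GTX 980 Ti": (3.881, 0.005),
    "GTX 1070 Ti": (0.958, 0.235),
}

if __name__ == "__main__":
    for name, (mean, se) in gromacs.items():
        print(f"{name}: {relative_standard_error(mean, se):.2f}%")
    print("noisy:", noisy_runs(gromacs))
```

A run flagged this way is a candidate for re-testing before its mean is compared against the other cards.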
Phoronix Test Suite v10.8.4