NGC AMD Ryzen 9 3950X 16-Core testing with an ASUS ROG CROSSHAIR VIII HERO (WI-FI) (1302 BIOS) and Zotac NVIDIA GeForce GTX 1070 Ti 8GB on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2009179-FI-NGC40795300&grr .
NGC system details (runs: GTX 980, GTX 970, GTX 980 Ti, GTX 1070 Ti)

System as listed for the GTX 980 run:
  Processor: AMD Ryzen 9 3950X 16-Core @ 3.50GHz (16 Cores / 32 Threads)
  Motherboard: ASUS ROG CROSSHAIR VIII HERO (WI-FI) (1302 BIOS)
  Chipset: AMD Starship/Matisse
  Memory: 16GB
  Disk: 2000GB Corsair Force MP600 + 2000GB
  Graphics: NVIDIA GeForce GTX 980 4GB (1126/3505MHz)
  Audio: NVIDIA GM204 HD Audio
  Monitor: DELL P2415Q
  Network: Realtek RTL8125 2.5GbE + Intel I211 + Intel Wi-Fi 6 AX200
  OS: Ubuntu 20.04
  Kernel: 5.4.0-47-generic (x86_64)
  Desktop: GNOME Shell 3.36.4
  Display Server: X Server 1.20.8
  Display Driver: NVIDIA 450.66
  OpenGL: 4.6.0
  OpenCL: OpenCL 1.2 CUDA 11.0.228 + OpenCL 2.0 AMD-APP (3182.0)
  Vulkan: 1.2.133
  Compiler: GCC 9.3.0 + CUDA 11.0
  File-System: ext4
  Screen Resolution: 3840x2160

Differences listed for the other runs:
  GTX 970: Graphics: eVGA NVIDIA GeForce GTX 970 4GB (1163/3505MHz)
  GTX 980 Ti: Graphics: NVIDIA GeForce GTX 980 Ti 6GB (999/3505MHz); Audio: NVIDIA GM200 HD Audio
  GTX 1070 Ti: Graphics: Zotac NVIDIA GeForce GTX 1070 Ti 8GB (139/4006MHz); Audio: NVIDIA GP104 HD Audio

Compiler Details
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
  GTX 980: Scaling Governor: acpi-cpufreq performance - CPU Microcode: 0x8701013
  GTX 970: Scaling Governor: acpi-cpufreq ondemand - CPU Microcode: 0x8701013
  GTX 980 Ti: Scaling Governor: acpi-cpufreq performance - CPU Microcode: 0x8701013
  GTX 1070 Ti: Scaling Governor: acpi-cpufreq ondemand - CPU Microcode: 0x8701013

OpenCL Details
  GTX 980: GPU Compute Cores: 2048
  GTX 970: GPU Compute Cores: 1664
  GTX 980 Ti: GPU Compute Cores: 2816
  GTX 1070 Ti: GPU Compute Cores: 2432

Python Details
  Python 3.8.2

Security Details
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling
  srbds: Not affected
  tsx_async_abort: Not affected
NGC results overview (- = no result reported for that card)

Test                                              GTX 980       GTX 970       GTX 980 Ti    GTX 1070 Ti
blender: Barbershop - CUDA                        1415.17       1540.95       1193.54       3430.14
redshift:                                         1002          1119          764           2530
blender: Classroom - NVIDIA OptiX                 423.50        481.80        342.76        2435.50
blender: Classroom - CUDA                         421.77        444.58        354.53        880.20
blender: Fishy Cat - CUDA                         328.65        351.06        292.20        931.55
blender: Pabellon Barcelona - CUDA                918.07        1012.28       780.37        -
blender: Pabellon Barcelona - NVIDIA OptiX        858.55        1031.90       690.29        -
gromacs-gpu: Water Benchmark                      3.288         3.001         3.881         0.958
blender: BMW27 - NVIDIA OptiX                     151.49        175.44        123.07        460.10
lczero: OpenCL                                    6275          4862          8291          4631
blender: Fishy Cat - NVIDIA OptiX                 319.37        369.54        258.49        -
blender: BMW27 - CUDA                             165.20        173.01        149.21        200.04
octanebench: Total Score                          110.594254    96.072862     143.092739    24.871588
glmark2: 800 x 600                                -             11325         12886         -
glmark2: 3840 x 2160                              -             2057          2933          -
glmark2: 1280 x 1024                              -             7007          8967          -
glmark2: 1920 x 1080                              -             5601          7363          -
glmark2: 1600 x 1200                              -             5688          7472          -
glmark2: 1024 x 768                               -             9148          11254         -
glmark2: 2560 x 1440                              -             3805          5293          -
glmark2: 1920 x 1200                              -             5185          6916          -
fahbench:                                         101.8257      90.9439       115.4876      -
plaidml: No - Inference - DenseNet 201 - OpenCL   82.06         73.69         99.51         88.37
namd-cuda: ATPase Simulation - 327,506 Atoms      0.32163       0.35160       0.28182       0.33292
plaidml: No - Training - Mobilenet - OpenCL       80.43         73.01         89.44         104.88
clpeak: Double-Precision Double                   159.64        137.81        196.28        -
cl-mem: Copy                                      145.0         125.8         217.5         107.8
rodinia: OpenCL Particle Filter                   11.729        13.049        10.095        11.525
cl-mem: Read                                      165.4         144.6         266.0         178.6
plaidml: No - Inference - IMDB LSTM - OpenCL      228.97        202.94        266.29        387.56
plaidml: No - Inference - Mobilenet - OpenCL      810.31        717.50        1030.27       875.35
cl-mem: Write                                     157           134.0         244.2         194.4
mandelgpu: GPU                                    126002111.2   112885853.7   152063380.5   -
viennacl: OpenCL LU Factorization                 63.1567       58.6527       64.9328       40.1134
plaidml: Yes - Inference - Mobilenet - OpenCL     1053.40       909.50        1281.97       1607.15
clpeak: Integer Compute INT                       1303.29       1145.39       1615.04       -
arrayfire: Conjugate Gradient OpenCL              4.883         5.403         3.623         5.115
clpeak: Single-Precision Float                    4448.81       3877.42       5556.00       -
clpeak: Global Memory Bandwidth                   165.23        144.51        264.34        -
neatbench: GPU                                    17.2          15.4          -             -
financebench: Black-Scholes OpenCL                12.019        14.198        10.061        79.121999
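Because the tests in the overview mix units and directions (seconds where fewer is better, scores where more is better), one way to compare cards is to normalize each row against a baseline card. The following is a minimal sketch of that arithmetic, not part of the Phoronix Test Suite: the helper name `relative_to_baseline` and the choice of the GTX 980 as baseline are assumptions made for illustration, and the three rows are copied from the overview.

```python
# Card order used throughout this sketch.
CARDS = ["GTX 980", "GTX 970", "GTX 980 Ti", "GTX 1070 Ti"]

# A few rows copied from the results overview.
# Each entry: (values in CARDS order, higher_is_better). None = no result reported.
RESULTS = {
    "blender: BMW27 - CUDA (seconds)": ((165.20, 173.01, 149.21, 200.04), False),
    "lczero: OpenCL (nodes/s)": ((6275, 4862, 8291, 4631), True),
    "clpeak: Double-Precision Double (GFLOPS)": ((159.64, 137.81, 196.28, None), True),
}


def relative_to_baseline(values, higher_is_better, baseline_index=0):
    """Return speed relative to the baseline card (hypothetical helper).

    1.0 means equal to the baseline and larger always means faster,
    regardless of whether the raw metric is fewer-is-better.
    """
    base = values[baseline_index]
    relative = []
    for value in values:
        if value is None:
            relative.append(None)          # keep gaps as gaps
        elif higher_is_better:
            relative.append(value / base)  # e.g. nodes/s, GFLOPS
        else:
            relative.append(base / value)  # e.g. seconds: invert the ratio
    return relative


if __name__ == "__main__":
    for test, (values, higher_is_better) in RESULTS.items():
        ratios = relative_to_baseline(values, higher_is_better)
        cells = [
            f"{card}: {'n/a' if r is None else format(r, '.2f')}"
            for card, r in zip(CARDS, ratios)
        ]
        print(f"{test} -> " + ", ".join(cells))
```

Inverting the ratio for fewer-is-better tests keeps a single reading ("larger is faster") across every row, which is what makes the normalized numbers comparable.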
Blender 2.90 - Blend File: Barbershop - Compute: CUDA (Seconds, Fewer Is Better)
  GTX 980: 1415.17
  GTX 970: 1540.95
  GTX 980 Ti: 1193.54
  GTX 1070 Ti: 3430.14
  SE +/- 0.23, N = 3; SE +/- 0.20, N = 3; SE +/- 0.91, N = 3 (three SE entries reported for four results)
RedShift Demo 3.0 (Seconds, Fewer Is Better)
  GTX 980: 1002 (SE +/- 1.76, N = 3)
  GTX 970: 1119 (SE +/- 2.19, N = 3)
  GTX 980 Ti: 764 (SE +/- 2.40, N = 3)
  GTX 1070 Ti: 2530 (SE +/- 371.60, N = 6)
Blender 2.90 - Blend File: Classroom - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  GTX 980: 423.50 (SE +/- 0.15, N = 3)
  GTX 970: 481.80 (SE +/- 0.10, N = 3)
  GTX 980 Ti: 342.76 (SE +/- 0.05, N = 3)
  GTX 1070 Ti: 2435.50 (SE +/- 241.03, N = 6)
Blender 2.90 - Blend File: Classroom - Compute: CUDA (Seconds, Fewer Is Better)
  GTX 980: 421.77 (SE +/- 0.35, N = 3)
  GTX 970: 444.58 (SE +/- 0.05, N = 3)
  GTX 980 Ti: 354.53 (SE +/- 0.14, N = 3)
  GTX 1070 Ti: 880.20 (SE +/- 92.48, N = 9)
Blender 2.90 - Blend File: Fishy Cat - Compute: CUDA (Seconds, Fewer Is Better)
  GTX 980: 328.65 (SE +/- 0.13, N = 3)
  GTX 970: 351.06 (SE +/- 0.06, N = 3)
  GTX 980 Ti: 292.20 (SE +/- 0.21, N = 3)
  GTX 1070 Ti: 931.55 (SE +/- 107.89, N = 9)
Blender 2.90 - Blend File: Pabellon Barcelona - Compute: CUDA (Seconds, Fewer Is Better)
  GTX 980: 918.07 (SE +/- 0.13, N = 3)
  GTX 970: 1012.28 (SE +/- 12.17, N = 3)
  GTX 980 Ti: 780.37 (SE +/- 0.09, N = 3)
Blender 2.90 - Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  GTX 980: 858.55 (SE +/- 0.33, N = 3)
  GTX 970: 1031.90 (SE +/- 0.90, N = 3)
  GTX 980 Ti: 690.29 (SE +/- 0.36, N = 3)
GROMACS 2020.3 - Water Benchmark (Ns Per Day, More Is Better)
  GTX 980: 3.288 (SE +/- 0.007, N = 3)
  GTX 970: 3.001 (SE +/- 0.000, N = 3)
  GTX 980 Ti: 3.881 (SE +/- 0.005, N = 3)
  GTX 1070 Ti: 0.958 (SE +/- 0.235, N = 9)
  1. (CXX) g++ options: -O3 -lpthread -ldl -lrt -lm
Blender 2.90 - Blend File: BMW27 - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  GTX 980: 151.49 (SE +/- 0.13, N = 3)
  GTX 970: 175.44 (SE +/- 0.16, N = 3)
  GTX 980 Ti: 123.07 (SE +/- 0.03, N = 3)
  GTX 1070 Ti: 460.10 (SE +/- 40.73, N = 9)
LeelaChessZero 0.26 - Backend: OpenCL (Nodes Per Second, More Is Better)
  GTX 980: 6275 (SE +/- 37.68, N = 3)
  GTX 970: 4862 (SE +/- 41.16, N = 3)
  GTX 980 Ti: 8291 (SE +/- 4.91, N = 3)
  GTX 1070 Ti: 4631 (SE +/- 984.05, N = 6)
  1. (CXX) g++ options: -flto -pthread
Blender 2.90 - Blend File: Fishy Cat - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  GTX 980: 319.37 (SE +/- 0.12, N = 3)
  GTX 970: 369.54 (SE +/- 0.03, N = 3)
  GTX 980 Ti: 258.49 (SE +/- 0.09, N = 3)
Blender 2.90 - Blend File: BMW27 - Compute: CUDA (Seconds, Fewer Is Better)
  GTX 980: 165.20 (SE +/- 0.23, N = 3)
  GTX 970: 173.01 (SE +/- 0.07, N = 3)
  GTX 980 Ti: 149.21 (SE +/- 0.04, N = 3)
  GTX 1070 Ti: 200.04 (SE +/- 7.36, N = 9)
OctaneBench 4.00c - Total Score (Score, More Is Better)
  GTX 980: 110.59
  GTX 970: 96.07
  GTX 980 Ti: 143.09
  GTX 1070 Ti: 24.87
GLmark2 2020.04 - Resolution: 800 x 600 (Score, More Is Better)
  GTX 970: 11325
  GTX 980 Ti: 12886
GLmark2 2020.04 - Resolution: 3840 x 2160 (Score, More Is Better)
  GTX 970: 2057
  GTX 980 Ti: 2933
GLmark2 2020.04 - Resolution: 1280 x 1024 (Score, More Is Better)
  GTX 970: 7007
  GTX 980 Ti: 8967
GLmark2 2020.04 - Resolution: 1920 x 1080 (Score, More Is Better)
  GTX 970: 5601
  GTX 980 Ti: 7363
GLmark2 2020.04 - Resolution: 1600 x 1200 (Score, More Is Better)
  GTX 970: 5688
  GTX 980 Ti: 7472
GLmark2 2020.04 - Resolution: 1024 x 768 (Score, More Is Better)
  GTX 970: 9148
  GTX 980 Ti: 11254
GLmark2 2020.04 - Resolution: 2560 x 1440 (Score, More Is Better)
  GTX 970: 3805
  GTX 980 Ti: 5293
GLmark2 2020.04 - Resolution: 1920 x 1200 (Score, More Is Better)
  GTX 970: 5185
  GTX 980 Ti: 6916
FAHBench 2.3.2 (Ns Per Day, More Is Better)
  GTX 980: 101.83 (SE +/- 0.17, N = 3)
  GTX 970: 90.94 (SE +/- 0.02, N = 3)
  GTX 980 Ti: 115.49 (SE +/- 0.05, N = 3)
PlaidML - FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL (FPS, More Is Better)
  GTX 980: 82.06 (SE +/- 0.14, N = 3)
  GTX 970: 73.69 (SE +/- 0.01, N = 3)
  GTX 980 Ti: 99.51 (SE +/- 0.05, N = 3)
  GTX 1070 Ti: 88.37 (SE +/- 6.24, N = 12)
NAMD CUDA 2.14 - ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better)
  GTX 980: 0.32163 (SE +/- 0.00070, N = 3)
  GTX 970: 0.35160 (SE +/- 0.00210, N = 3)
  GTX 980 Ti: 0.28182 (SE +/- 0.00249, N = 15)
  GTX 1070 Ti: 0.33292 (SE +/- 0.01554, N = 15)
PlaidML - FP16: No - Mode: Training - Network: Mobilenet - Device: OpenCL (FPS, More Is Better)
  GTX 980: 80.43 (SE +/- 0.02, N = 3)
  GTX 970: 73.01 (SE +/- 0.03, N = 3)
  GTX 980 Ti: 89.44 (SE +/- 0.12, N = 3)
  GTX 1070 Ti: 104.88 (SE +/- 0.03, N = 3)
clpeak - OpenCL Test: Double-Precision Double (GFLOPS, More Is Better)
  GTX 980: 159.64 (SE +/- 0.02, N = 3)
  GTX 970: 137.81 (SE +/- 0.01, N = 3)
  GTX 980 Ti: 196.28 (SE +/- 0.03, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
cl-mem 2017-01-13 - Benchmark: Copy (GB/s, More Is Better)
  GTX 980: 145.0 (SE +/- 0.07, N = 3)
  GTX 970: 125.8 (SE +/- 0.03, N = 3)
  GTX 980 Ti: 217.5 (SE +/- 0.23, N = 3)
  GTX 1070 Ti: 107.8 (SE +/- 13.47, N = 12)
  1. (CC) gcc options: -O2 -flto -lOpenCL
Rodinia 3.1 - Test: OpenCL Particle Filter (Seconds, Fewer Is Better)
  GTX 980: 11.73 (SE +/- 0.14, N = 3)
  GTX 970: 13.05 (SE +/- 0.10, N = 3)
  GTX 980 Ti: 10.10 (SE +/- 0.15, N = 4)
  GTX 1070 Ti: 11.53 (SE +/- 1.42, N = 12)
  1. (CXX) g++ options: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl
cl-mem 2017-01-13 - Benchmark: Read (GB/s, More Is Better)
  GTX 980: 165.4 (SE +/- 0.06, N = 3)
  GTX 970: 144.6 (SE +/- 0.03, N = 3)
  GTX 980 Ti: 266.0 (SE +/- 0.15, N = 3)
  GTX 1070 Ti: 178.6 (SE +/- 12.40, N = 12)
  1. (CC) gcc options: -O2 -flto -lOpenCL
PlaidML - FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL (FPS, More Is Better)
  GTX 980: 228.97 (SE +/- 0.16, N = 3)
  GTX 970: 202.94 (SE +/- 0.01, N = 3)
  GTX 980 Ti: 266.29 (SE +/- 0.11, N = 3)
  GTX 1070 Ti: 387.56 (SE +/- 0.12, N = 3)
PlaidML - FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL (FPS, More Is Better)
  GTX 980: 810.31 (SE +/- 0.50, N = 3)
  GTX 970: 717.50 (SE +/- 0.32, N = 3)
  GTX 980 Ti: 1030.27 (SE +/- 1.62, N = 3)
  GTX 1070 Ti: 875.35 (SE +/- 59.39, N = 12)
cl-mem 2017-01-13 - Benchmark: Write (GB/s, More Is Better)
  GTX 980: 157.0
  GTX 970: 134.0
  GTX 980 Ti: 244.2
  GTX 1070 Ti: 194.4
  SE +/- 0.03, N = 3; SE +/- 0.06, N = 3; SE +/- 0.92, N = 3 (three SE entries reported for four results)
  1. (CC) gcc options: -O2 -flto -lOpenCL
MandelGPU 1.3pts1 - OpenCL Device: GPU (Samples/sec, More Is Better)
  GTX 980: 126002111.2 (SE +/- 172505.04, N = 3)
  GTX 970: 112885853.7 (SE +/- 116092.47, N = 3)
  GTX 980 Ti: 152063380.5 (SE +/- 367945.59, N = 3)
  1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
ViennaCL 1.4.2 - OpenCL LU Factorization (GFLOPS, More Is Better)
  GTX 980: 63.16 (SE +/- 0.21, N = 3)
  GTX 970: 58.65 (SE +/- 0.19, N = 3)
  GTX 980 Ti: 64.93 (SE +/- 0.45, N = 3)
  GTX 1070 Ti: 40.11 (SE +/- 5.70, N = 15)
  1. (CXX) g++ options: -rdynamic -lOpenCL
PlaidML - FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL (FPS, More Is Better)
  GTX 980: 1053.40 (SE +/- 1.60, N = 3)
  GTX 970: 909.50 (SE +/- 1.17, N = 3)
  GTX 980 Ti: 1281.97 (SE +/- 0.94, N = 3)
  GTX 1070 Ti: 1607.15 (SE +/- 2.45, N = 3)
clpeak - OpenCL Test: Integer Compute INT (GIOPS, More Is Better)
  GTX 980: 1303.29 (SE +/- 3.33, N = 3)
  GTX 970: 1145.39 (SE +/- 14.81, N = 5)
  GTX 980 Ti: 1615.04 (SE +/- 11.49, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
ArrayFire 3.7 - Test: Conjugate Gradient OpenCL (ms, Fewer Is Better)
  GTX 980: 4.883 (SE +/- 0.003, N = 3)
  GTX 970: 5.403 (SE +/- 0.004, N = 3)
  GTX 980 Ti: 3.623 (SE +/- 0.004, N = 3)
  GTX 1070 Ti: 5.115 (SE +/- 0.062, N = 3)
  1. (CXX) g++ options: -rdynamic
clpeak - OpenCL Test: Single-Precision Float (GFLOPS, More Is Better)
  GTX 980: 4448.81 (SE +/- 67.71, N = 3)
  GTX 970: 3877.42 (SE +/- 43.66, N = 15)
  GTX 980 Ti: 5556.00 (SE +/- 20.25, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
clpeak - OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better)
  GTX 980: 165.23 (SE +/- 0.10, N = 3)
  GTX 970: 144.51 (SE +/- 0.04, N = 3)
  GTX 980 Ti: 264.34 (SE +/- 0.41, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
NeatBench 5 - Acceleration: GPU (FPS, More Is Better)
  GTX 980: 17.2 (SE +/- 0.12, N = 3)
  GTX 970: 15.4 (SE +/- 0.00, N = 3)
FinanceBench 2016-06-06 - Benchmark: Black-Scholes OpenCL (ms, Fewer Is Better)
  GTX 980: 12.02 (SE +/- 0.00, N = 3)
  GTX 970: 14.20 (SE +/- 0.00, N = 3)
  GTX 980 Ti: 10.06 (SE +/- 0.01, N = 3)
  GTX 1070 Ti: 79.12 (SE +/- 0.44, N = 3)
  1. (CXX) g++ options: -O3 -lOpenCL
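Several results above report a standard error (SE) that is large relative to the mean, together with a raised run count N. A rough way to spot such results is the relative standard error, SE divided by the mean. The sketch below applies it to the RedShift Demo figures. It assumes the SE entries are listed in the same order as the cards, and both the 5% threshold and the helper names are arbitrary choices for illustration rather than anything defined by the Phoronix Test Suite.

```python
# RedShift Demo 3.0 (seconds): card -> (mean, standard error).
# Assumption: SE entries in the export follow the same order as the cards.
REDSHIFT = {
    "GTX 980": (1002, 1.76),
    "GTX 970": (1119, 2.19),
    "GTX 980 Ti": (764, 2.40),
    "GTX 1070 Ti": (2530, 371.60),
}


def relative_standard_error(mean, standard_error):
    """Standard error as a percentage of the mean."""
    return 100.0 * standard_error / mean


def noisy_results(results, threshold_percent=5.0):
    """Return the cards whose relative SE exceeds the threshold (hypothetical helper)."""
    return [
        card
        for card, (mean, se) in results.items()
        if relative_standard_error(mean, se) > threshold_percent
    ]


if __name__ == "__main__":
    for card, (mean, se) in REDSHIFT.items():
        print(f"{card}: {relative_standard_error(mean, se):.2f}% relative SE")
    print("Above threshold:", noisy_results(REDSHIFT))
```

A result flagged this way is not necessarily wrong, but its mean is a weaker basis for comparison than the tightly clustered results from the other cards.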
Phoronix Test Suite v10.8.4