2970wx dec: AMD Ryzen Threadripper 2970WX 24-Core testing with a Gigabyte X399 AORUS Gaming 7 (F12h BIOS) and a Sapphire AMD Radeon RX 550 640SP / 560/560X 4GB on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2012182-HA-2970WXDEC72&grr&rdt.
System details (identical for runs 1, 2, and 3):

  Processor: AMD Ryzen Threadripper 2970WX 24-Core @ 3.00GHz (24 Cores / 48 Threads)
  Motherboard: Gigabyte X399 AORUS Gaming 7 (F12h BIOS)
  Chipset: AMD 17h
  Memory: 16GB
  Disk: 120GB Corsair Force MP500
  Graphics: Sapphire AMD Radeon RX 550 640SP / 560/560X 4GB (1300/1750MHz)
  Audio: Realtek ALC1220
  Monitor: VA2431
  Network: Qualcomm Atheros Killer E2500 + 2 x QLogic cLOM8214 1/10GbE + Intel 8265 / 8275
  OS: Ubuntu 20.04
  Kernel: 5.9.0-050900rc6daily20200926-generic (x86_64) 20200925
  Desktop: GNOME Shell 3.36.4
  Display Server: X Server 1.20.8
  Display Driver: modesetting 1.20.8
  OpenGL: 4.6 Mesa 20.0.8 (LLVM 10.0.0)
  Compiler: GCC 9.3.0
  File-System: ext4
  Screen Resolution: 1920x1080

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled); CPU Microcode: 0x800820d

Security Details: itlb_multihit: Not affected; l1tf: Not affected; mds: Not affected; meltdown: Not affected; spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp; spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization; spectre_v2: Mitigation of Full AMD retpoline, IBPB: conditional, STIBP: disabled, RSB filling; srbds: Not affected; tsx_async_abort: Not affected
Results summary (runs 1 / 2 / 3; lower is better unless noted):

Test | 1 | 2 | 3
onednn: Recurrent Neural Network Training - u8s8f32 - CPU (ms) | 6633.10 | 6653.01 | 6728.90
onednn: Recurrent Neural Network Training - f32 - CPU (ms) | 6451.53 | 6560.67 | 6664.28
node-web-tooling (runs/s, higher is better) | 9.38 | 9.36 | 9.38
hmmer: Pfam Database Search (s) | 149.789 | 150.530 | 150.089
onednn: Recurrent Neural Network Inference - f32 - CPU (ms) | 3594.17 | 3709.94 | 3646.05
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU (ms) | 6484.81 | 6602.60 | 6641.53
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU (ms) | 3622.89 | 3702.62 | 3640.48
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU (ms) | 3597.14 | 3630.31 | 3668.58
onednn: Deconvolution Batch shapes_1d - f32 - CPU (ms) | 4.19594 | 4.10165 | 4.19283
sqlite-speedtest: Timed Time - Size 1,000 (s) | 69.136 | 69.248 | 69.057
simdjson: LargeRandom (GB/s, higher is better) | 0.37 | 0.37 | 0.37
simdjson: PartialTweets (GB/s, higher is better) | 0.47 | 0.47 | 0.47
simdjson: DistinctUserID (GB/s, higher is better) | 0.48 | 0.48 | 0.48
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU (ms) | 8.51023 | 8.78312 | 8.38704
simdjson: Kostya (GB/s, higher is better) | 0.43 | 0.43 | 0.43
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU (ms) | 1.44123 | 1.42362 | 1.44413
coremark: CoreMark Size 666 (iterations/s, higher is better) | 865092.285699 | 846971.130291 | 853443.848280
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU (ms) | 1.92621 | 1.90003 | 1.88404
onednn: IP Shapes 3D - f32 - CPU (ms) | 11.5782 | 11.9809 | 11.4454
onednn: IP Shapes 1D - f32 - CPU (ms) | 6.76485 | 6.71480 | 6.63095
onednn: IP Shapes 1D - u8s8f32 - CPU (ms) | 2.93562 | 2.98645 | 2.91526
mafft: Multiple Sequence Alignment - LSU RNA (s) | 12.497 | 12.619 | 12.375
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU (ms) | 4.44169 | 4.43080 | 4.58462
onednn: IP Shapes 3D - u8s8f32 - CPU (ms) | 3.04451 | 3.05076 | 2.94226
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU (ms) | 25.0516 | 25.1820 | 25.0447
onednn: Convolution Batch Shapes Auto - f32 - CPU (ms) | 20.0101 | 19.9705 | 19.9758
onednn: Deconvolution Batch shapes_3d - f32 - CPU (ms) | 6.07729 | 5.92366 | 5.97994
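One way to gauge run-to-run consistency from the summary table is to compute the max-to-min spread for each test. The sketch below does this for three representative results; the values are transcribed from the table, and the spread metric itself is our own choice rather than anything the Phoronix Test Suite reports.

```python
# Run-to-run spread for a few results transcribed from the summary table.
# The spread metric (max-min as a percentage of min) is our own analysis,
# not a Phoronix Test Suite output.
results = {
    "CoreMark (iterations/s)": (865092.29, 846971.13, 853443.85),
    "oneDNN RNN Training f32 (ms)": (6451.53, 6560.67, 6664.28),
    "HMMer Pfam search (s)": (149.789, 150.530, 150.089),
}

def spread_pct(values):
    """Max-to-min spread as a percentage of the minimum value."""
    return (max(values) - min(values)) / min(values) * 100.0

for name, vals in results.items():
    print(f"{name}: {spread_pct(vals):.2f}% spread across runs 1-3")
```

On these numbers the RNN training result varies the most between runs (roughly 3%), while HMMer stays within half a percent.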
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
  1: 6633.10 (SE +/- 93.02, N = 12, MIN: 5847.18)
  2: 6653.01 (SE +/- 46.42, N = 3, MIN: 6534.78)
  3: 6728.90 (SE +/- 165.28, N = 15, MIN: 6041.82)
  Compiled with: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, fewer is better)
  1: 6451.53 (SE +/- 93.51, N = 3, MIN: 6172.96)
  2: 6560.67 (SE +/- 78.96, N = 3, MIN: 6388.99)
  3: 6664.28 (SE +/- 72.71, N = 15, MIN: 6146.12)
  Compiled with: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Node.js V8 Web Tooling Benchmark (runs/s, more is better)
  1: 9.38 (SE +/- 0.07, N = 3)
  2: 9.36 (SE +/- 0.07, N = 14)
  3: 9.38 (SE +/- 0.10, N = 3)
  Runtime: Node.js v10.19.0
Timed HMMer Search 3.3.1 - Pfam Database Search (Seconds, fewer is better)
  1: 149.79 (SE +/- 0.14, N = 3)
  2: 150.53 (SE +/- 0.10, N = 3)
  3: 150.09 (SE +/- 0.28, N = 3)
  Compiled with: (CC) gcc options: -O3 -pthread -lhmmer -leasel -lm
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, fewer is better)
  1: 3594.17 (SE +/- 46.45, N = 5, MIN: 3455.7)
  2: 3709.94 (SE +/- 44.18, N = 3, MIN: 3573.95)
  3: 3646.05 (SE +/- 46.32, N = 3, MIN: 3532.55)
  Compiled with: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
  1: 6484.81 (SE +/- 90.37, N = 3, MIN: 6295.23)
  2: 6602.60 (SE +/- 23.47, N = 3, MIN: 6514.96)
  3: 6641.53 (SE +/- 25.84, N = 3, MIN: 6546.95)
  Compiled with: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
  1: 3622.89 (SE +/- 38.47, N = 3, MIN: 3505.26)
  2: 3702.62 (SE +/- 12.34, N = 3, MIN: 3657.6)
  3: 3640.48 (SE +/- 32.17, N = 3, MIN: 3497.09)
  Compiled with: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
  1: 3597.14 (SE +/- 18.09, N = 3, MIN: 3482.46)
  2: 3630.31 (SE +/- 25.42, N = 3, MIN: 3562.81)
  3: 3668.58 (SE +/- 54.82, N = 3, MIN: 3467.02)
  Compiled with: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, fewer is better)
  1: 4.19594 (SE +/- 0.03827, N = 15, MIN: 3.57)
  2: 4.10165 (SE +/- 0.05836, N = 4, MIN: 3.57)
  3: 4.19283 (SE +/- 0.03652, N = 11, MIN: 3.57)
  Compiled with: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
SQLite Speedtest 3.30 - Timed Time - Size 1,000 (Seconds, fewer is better)
  1: 69.14 (SE +/- 0.28, N = 3)
  2: 69.25 (SE +/- 0.17, N = 3)
  3: 69.06 (SE +/- 0.26, N = 3)
  Compiled with: (CC) gcc options: -O2 -ldl -lz -lpthread
simdjson 0.7.1 - Throughput Test: LargeRandom (GB/s, more is better)
  1: 0.37 (SE +/- 0.00, N = 3)
  2: 0.37 (SE +/- 0.00, N = 3)
  3: 0.37 (SE +/- 0.00, N = 3)
  Compiled with: (CXX) g++ options: -O3 -pthread
simdjson 0.7.1 - Throughput Test: PartialTweets (GB/s, more is better)
  1: 0.47 (SE +/- 0.00, N = 3)
  2: 0.47 (SE +/- 0.00, N = 3)
  3: 0.47 (SE +/- 0.00, N = 3)
  Compiled with: (CXX) g++ options: -O3 -pthread
simdjson 0.7.1 - Throughput Test: DistinctUserID (GB/s, more is better)
  1: 0.48 (SE +/- 0.00, N = 3)
  2: 0.48 (SE +/- 0.00, N = 3)
  3: 0.48 (SE +/- 0.00, N = 3)
  Compiled with: (CXX) g++ options: -O3 -pthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
  1: 8.51023 (SE +/- 0.10996, N = 5, MIN: 7.87)
  2: 8.78312 (SE +/- 0.09475, N = 15, MIN: 7.86)
  3: 8.38704 (SE +/- 0.11923, N = 4, MIN: 7.89)
  Compiled with: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
simdjson 0.7.1 - Throughput Test: Kostya (GB/s, more is better)
  1: 0.43 (SE +/- 0.00, N = 3)
  2: 0.43 (SE +/- 0.00, N = 3)
  3: 0.43 (SE +/- 0.00, N = 3)
  Compiled with: (CXX) g++ options: -O3 -pthread
oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, fewer is better)
  1: 1.44123 (SE +/- 0.02698, N = 14, MIN: 1.19)
  2: 1.42362 (SE +/- 0.00296, N = 3, MIN: 1.22)
  3: 1.44413 (SE +/- 0.00890, N = 3, MIN: 1.23)
  Compiled with: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Coremark 1.0 - CoreMark Size 666 - Iterations Per Second (Iterations/Sec, more is better)
  1: 865092.29 (SE +/- 9940.03, N = 3)
  2: 846971.13 (SE +/- 6878.65, N = 3)
  3: 853443.85 (SE +/- 5068.60, N = 3)
  Compiled with: (CC) gcc options: -O2 -lrt
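The CoreMark runs differ by roughly 18k iterations/s, but each run also carries a sizable standard error. A quick two-sample z-style check (our own analysis, using the means and standard errors reported above; this is not something the Phoronix Test Suite computes) suggests the gap between the fastest and slowest run is within measurement noise:

```python
import math

# Means and standard errors transcribed from the CoreMark result above.
mean1, se1 = 865092.29, 9940.03   # run 1 (N = 3)
mean2, se2 = 846971.13, 6878.65   # run 2 (N = 3)

# Two-sample z-style statistic: difference of means over the combined SE.
z = abs(mean1 - mean2) / math.hypot(se1, se2)
print(f"difference = {abs(mean1 - mean2):.2f} iterations/s, z ~= {z:.2f}")
```

A statistic well below ~2 (here about 1.5) means the run-to-run difference is not clearly distinguishable from the per-run sampling noise, especially given only N = 3 samples per run.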
oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
  1: 1.92621 (SE +/- 0.01123, N = 3, MIN: 1.69)
  2: 1.90003 (SE +/- 0.02536, N = 5, MIN: 1.54)
  3: 1.88404 (SE +/- 0.02268, N = 6, MIN: 1.55)
  Compiled with: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, fewer is better)
  1: 11.58 (SE +/- 0.04, N = 3, MIN: 10.49)
  2: 11.98 (SE +/- 0.11, N = 10, MIN: 10.95)
  3: 11.45 (SE +/- 0.18, N = 3, MIN: 10.56)
  Compiled with: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, fewer is better)
  1: 6.76485 (SE +/- 0.02229, N = 3, MIN: 6)
  2: 6.71480 (SE +/- 0.02265, N = 3, MIN: 5.95)
  3: 6.63095 (SE +/- 0.02491, N = 3, MIN: 5.84)
  Compiled with: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
  1: 2.93562 (SE +/- 0.02616, N = 3, MIN: 2.75)
  2: 2.98645 (SE +/- 0.02712, N = 3, MIN: 2.71)
  3: 2.91526 (SE +/- 0.02861, N = 3, MIN: 2.72)
  Compiled with: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Timed MAFFT Alignment 7.471 - Multiple Sequence Alignment - LSU RNA (Seconds, fewer is better)
  1: 12.50 (SE +/- 0.11, N = 3)
  2: 12.62 (SE +/- 0.15, N = 3)
  3: 12.38 (SE +/- 0.12, N = 3)
  Compiled with: (CC) gcc options: -std=c99 -O3 -lm -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
  1: 4.44169 (SE +/- 0.07485, N = 12, MIN: 4.05)
  2: 4.43080 (SE +/- 0.05888, N = 3, MIN: 4.05)
  3: 4.58462 (SE +/- 0.13249, N = 15, MIN: 4.05)
  Compiled with: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
  1: 3.04451 (SE +/- 0.02161, N = 3, MIN: 2.57)
  2: 3.05076 (SE +/- 0.02875, N = 3, MIN: 2.36)
  3: 2.94226 (SE +/- 0.00209, N = 3, MIN: 2.27)
  Compiled with: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
  1: 25.05 (SE +/- 0.02, N = 3, MIN: 23.37)
  2: 25.18 (SE +/- 0.02, N = 3, MIN: 24.06)
  3: 25.04 (SE +/- 0.02, N = 3, MIN: 23.83)
  Compiled with: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, fewer is better)
  1: 20.01 (SE +/- 0.03, N = 3, MIN: 14.98)
  2: 19.97 (SE +/- 0.02, N = 3, MIN: 14.96)
  3: 19.98 (SE +/- 0.02, N = 3, MIN: 15)
  Compiled with: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, fewer is better)
  1: 6.07729 (SE +/- 0.07612, N = 5, MIN: 5.36)
  2: 5.92366 (SE +/- 0.00734, N = 3, MIN: 5.34)
  3: 5.97994 (SE +/- 0.00358, N = 3, MIN: 5.34)
  Compiled with: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Phoronix Test Suite v10.8.5