2970wx dec: AMD Ryzen Threadripper 2970WX 24-Core testing with a Gigabyte X399 AORUS Gaming 7 (F12h BIOS) and Sapphire AMD Radeon RX 550 640SP / 560/560X 4GB on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2012182-HA-2970WXDEC72&sor&grs.
2970wx dec - System Details (identical for runs 1, 2, and 3)

Processor: AMD Ryzen Threadripper 2970WX 24-Core @ 3.00GHz (24 Cores / 48 Threads)
Motherboard: Gigabyte X399 AORUS Gaming 7 (F12h BIOS)
Chipset: AMD 17h
Memory: 16GB
Disk: 120GB Corsair Force MP500
Graphics: Sapphire AMD Radeon RX 550 640SP / 560/560X 4GB (1300/1750MHz)
Audio: Realtek ALC1220
Monitor: VA2431
Network: Qualcomm Atheros Killer E2500 + 2 x QLogic cLOM8214 1/10GbE + Intel 8265 / 8275
OS: Ubuntu 20.04
Kernel: 5.9.0-050900rc6daily20200926-generic (x86_64) 20200925
Desktop: GNOME Shell 3.36.4
Display Server: X Server 1.20.8
Display Driver: modesetting 1.20.8
OpenGL: 4.6 Mesa 20.0.8 (LLVM 10.0.0)
Compiler: GCC 9.3.0
File-System: ext4
Screen Resolution: 1920x1080

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled); CPU Microcode: 0x800820d
Security Details: itlb_multihit: Not affected; l1tf: Not affected; mds: Not affected; meltdown: Not affected; spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp; spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization; spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling; srbds: Not affected; tsx_async_abort: Not affected
2970wx dec - Result Summary (ms and Seconds: fewer is better; Iterations/Sec, runs/s, GB/s: more is better)

Test                                                                          Run 1          Run 2          Run 3
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU (ms)                    8.51023        8.78312        8.38704
onednn: IP Shapes 3D - f32 - CPU (ms)                                         11.5782        11.9809        11.4454
onednn: IP Shapes 3D - u8s8f32 - CPU (ms)                                     3.04451        3.05076        2.94226
onednn: Recurrent Neural Network Training - f32 - CPU (ms)                    6451.53        6560.67        6664.28
onednn: Recurrent Neural Network Inference - f32 - CPU (ms)                   3594.17        3709.94        3646.05
onednn: Deconvolution Batch shapes_3d - f32 - CPU (ms)                        6.07729        5.92366        5.97994
onednn: IP Shapes 1D - u8s8f32 - CPU (ms)                                     2.93562        2.98645        2.91526
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU (ms)           6484.81        6602.60        6641.53
onednn: Deconvolution Batch shapes_1d - f32 - CPU (ms)                        4.19594        4.10165        4.19283
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU (ms)         1.92621        1.90003        1.88404
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU (ms)          3622.89        3702.62        3640.48
coremark: CoreMark Size 666 (Iterations/Sec)                                  865092.285699  846971.130291  853443.848280
onednn: IP Shapes 1D - f32 - CPU (ms)                                         6.76485        6.71480        6.63095
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU (ms)               3597.14        3630.31        3668.58
mafft: Multiple Sequence Alignment - LSU RNA (Seconds)                        12.497         12.619         12.375
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU (ms)                    25.0516        25.1820        25.0447
hmmer: Pfam Database Search (Seconds)                                         149.789        150.530        150.089
sqlite-speedtest: Timed Time - Size 1,000 (Seconds)                           69.136         69.248         69.057
node-web-tooling: (runs/s)                                                    9.38           9.36           9.38
onednn: Convolution Batch Shapes Auto - f32 - CPU (ms)                        20.0101        19.9705        19.9758
simdjson: DistinctUserID (GB/s)                                               0.48           0.48           0.48
simdjson: PartialTweets (GB/s)                                                0.47           0.47           0.47
simdjson: LargeRandom (GB/s)                                                  0.37           0.37           0.37
simdjson: Kostya (GB/s)                                                       0.43           0.43           0.43
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU (ms)             1.44123        1.42362        1.44413
onednn: Recurrent Neural Network Training - u8s8f32 - CPU (ms)                6633.10        6653.01        6728.90
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU (ms)                    4.44169        4.43080        4.58462
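Each per-test breakdown below reports a mean alongside "SE +/- x, N = y", where N is the number of trials averaged. Assuming SE here is the ordinary sample standard error (sample standard deviation over sqrt(N); the exact estimator the Phoronix Test Suite uses is an assumption), it can be reproduced from raw trial values like this:

```python
import math

def mean_and_se(samples):
    """Return (mean, standard error) for a list of repeated trial results.

    SE is the sample standard deviation (Bessel-corrected, n - 1
    denominator) divided by sqrt(n) -- the usual standard error of
    the mean. Whether PTS uses exactly this estimator is an assumption.
    """
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return mean, math.sqrt(var / n)

# Hypothetical trial times (ms) for one run; not actual recorded trials.
m, se = mean_and_se([8.3, 8.5, 8.4, 8.3])
print(f"{m:.3f} ms, SE +/- {se:.3f}, N = 4")
```

A small SE relative to the gap between runs (as in the HMMer or SQLite results below) suggests the run-to-run differences are real rather than noise.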
oneDNN 2.0 results (ms, fewer is better; compiled with (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread):

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
  Run 3: 8.38704 (SE +/- 0.11923, N = 4, MIN: 7.89)
  Run 1: 8.51023 (SE +/- 0.10996, N = 5, MIN: 7.87)
  Run 2: 8.78312 (SE +/- 0.09475, N = 15, MIN: 7.86)

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU
  Run 3: 11.45 (SE +/- 0.18, N = 3, MIN: 10.56)
  Run 1: 11.58 (SE +/- 0.04, N = 3, MIN: 10.49)
  Run 2: 11.98 (SE +/- 0.11, N = 10, MIN: 10.95)

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU
  Run 3: 2.94226 (SE +/- 0.00209, N = 3, MIN: 2.27)
  Run 1: 3.04451 (SE +/- 0.02161, N = 3, MIN: 2.57)
  Run 2: 3.05076 (SE +/- 0.02875, N = 3, MIN: 2.36)

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
  Run 1: 6451.53 (SE +/- 93.51, N = 3, MIN: 6172.96)
  Run 2: 6560.67 (SE +/- 78.96, N = 3, MIN: 6388.99)
  Run 3: 6664.28 (SE +/- 72.71, N = 15, MIN: 6146.12)

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
  Run 1: 3594.17 (SE +/- 46.45, N = 5, MIN: 3455.7)
  Run 3: 3646.05 (SE +/- 46.32, N = 3, MIN: 3532.55)
  Run 2: 3709.94 (SE +/- 44.18, N = 3, MIN: 3573.95)

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU
  Run 2: 5.92366 (SE +/- 0.00734, N = 3, MIN: 5.34)
  Run 3: 5.97994 (SE +/- 0.00358, N = 3, MIN: 5.34)
  Run 1: 6.07729 (SE +/- 0.07612, N = 5, MIN: 5.36)

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU
  Run 3: 2.91526 (SE +/- 0.02861, N = 3, MIN: 2.72)
  Run 1: 2.93562 (SE +/- 0.02616, N = 3, MIN: 2.75)
  Run 2: 2.98645 (SE +/- 0.02712, N = 3, MIN: 2.71)

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
  Run 1: 6484.81 (SE +/- 90.37, N = 3, MIN: 6295.23)
  Run 2: 6602.60 (SE +/- 23.47, N = 3, MIN: 6514.96)
  Run 3: 6641.53 (SE +/- 25.84, N = 3, MIN: 6546.95)

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
  Run 2: 4.10165 (SE +/- 0.05836, N = 4, MIN: 3.57)
  Run 3: 4.19283 (SE +/- 0.03652, N = 11, MIN: 3.57)
  Run 1: 4.19594 (SE +/- 0.03827, N = 15, MIN: 3.57)

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
  Run 3: 1.88404 (SE +/- 0.02268, N = 6, MIN: 1.55)
  Run 2: 1.90003 (SE +/- 0.02536, N = 5, MIN: 1.54)
  Run 1: 1.92621 (SE +/- 0.01123, N = 3, MIN: 1.69)

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
  Run 1: 3622.89 (SE +/- 38.47, N = 3, MIN: 3505.26)
  Run 3: 3640.48 (SE +/- 32.17, N = 3, MIN: 3497.09)
  Run 2: 3702.62 (SE +/- 12.34, N = 3, MIN: 3657.6)
Coremark 1.0 - CoreMark Size 666 - Iterations Per Second (Iterations/Sec, more is better; compiled with (CC) gcc options: -O2 -lrt)
  Run 1: 865092.29 (SE +/- 9940.03, N = 3)
  Run 3: 853443.85 (SE +/- 5068.60, N = 3)
  Run 2: 846971.13 (SE +/- 6878.65, N = 3)
oneDNN 2.0 results (ms, fewer is better; compiled with (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread):

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU
  Run 3: 6.63095 (SE +/- 0.02491, N = 3, MIN: 5.84)
  Run 2: 6.71480 (SE +/- 0.02265, N = 3, MIN: 5.95)
  Run 1: 6.76485 (SE +/- 0.02229, N = 3, MIN: 6)

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
  Run 1: 3597.14 (SE +/- 18.09, N = 3, MIN: 3482.46)
  Run 2: 3630.31 (SE +/- 25.42, N = 3, MIN: 3562.81)
  Run 3: 3668.58 (SE +/- 54.82, N = 3, MIN: 3467.02)
Timed MAFFT Alignment 7.471 - Multiple Sequence Alignment - LSU RNA (Seconds, fewer is better; compiled with (CC) gcc options: -std=c99 -O3 -lm -lpthread)
  Run 3: 12.38 (SE +/- 0.12, N = 3)
  Run 1: 12.50 (SE +/- 0.11, N = 3)
  Run 2: 12.62 (SE +/- 0.15, N = 3)
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better; compiled with (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread)
  Run 3: 25.04 (SE +/- 0.02, N = 3, MIN: 23.83)
  Run 1: 25.05 (SE +/- 0.02, N = 3, MIN: 23.37)
  Run 2: 25.18 (SE +/- 0.02, N = 3, MIN: 24.06)
Timed HMMer Search 3.3.1 - Pfam Database Search (Seconds, fewer is better; compiled with (CC) gcc options: -O3 -pthread -lhmmer -leasel -lm)
  Run 1: 149.79 (SE +/- 0.14, N = 3)
  Run 3: 150.09 (SE +/- 0.28, N = 3)
  Run 2: 150.53 (SE +/- 0.10, N = 3)
SQLite Speedtest 3.30 - Timed Time - Size 1,000 (Seconds, fewer is better; compiled with (CC) gcc options: -O2 -ldl -lz -lpthread)
  Run 3: 69.06 (SE +/- 0.26, N = 3)
  Run 1: 69.14 (SE +/- 0.28, N = 3)
  Run 2: 69.25 (SE +/- 0.17, N = 3)
Node.js V8 Web Tooling Benchmark (runs/s, more is better; Node.js v10.19.0)
  Run 3: 9.38 (SE +/- 0.10, N = 3)
  Run 1: 9.38 (SE +/- 0.07, N = 3)
  Run 2: 9.36 (SE +/- 0.07, N = 14)
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, fewer is better; compiled with (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread)
  Run 2: 19.97 (SE +/- 0.02, N = 3, MIN: 14.96)
  Run 3: 19.98 (SE +/- 0.02, N = 3, MIN: 15)
  Run 1: 20.01 (SE +/- 0.03, N = 3, MIN: 14.98)
simdjson 0.7.1 throughput (GB/s, more is better; compiled with (CXX) g++ options: -O3 -pthread). All three runs produced identical results (SE +/- 0.00, N = 3 for each):
  DistinctUserID: 0.48
  PartialTweets: 0.47
  LargeRandom: 0.37
  Kostya: 0.43
oneDNN 2.0 results (ms, fewer is better; compiled with (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread):

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
  Run 2: 1.42362 (SE +/- 0.00296, N = 3, MIN: 1.22)
  Run 1: 1.44123 (SE +/- 0.02698, N = 14, MIN: 1.19)
  Run 3: 1.44413 (SE +/- 0.00890, N = 3, MIN: 1.23)

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU
  Run 1: 6633.10 (SE +/- 93.02, N = 12, MIN: 5847.18)
  Run 2: 6653.01 (SE +/- 46.42, N = 3, MIN: 6534.78)
  Run 3: 6728.90 (SE +/- 165.28, N = 15, MIN: 6041.82)

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU
  Run 2: 4.43080 (SE +/- 0.05888, N = 3, MIN: 4.05)
  Run 1: 4.44169 (SE +/- 0.07485, N = 12, MIN: 4.05)
  Run 3: 4.58462 (SE +/- 0.13249, N = 15, MIN: 4.05)
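The &sor&grs parameters in the exported URL indicate the result view was sorted and summarized across all tests, which is conventionally done with a geometric mean so that no single test's scale dominates. As a sketch only (the exact normalization OpenBenchmarking applies is an assumption), a cross-test comparison of two runs can be formed from per-test ratios:

```python
import math

def geometric_mean(values):
    """Geometric mean: exp of the average log. For per-test ratios,
    a result < 1 means the numerator run was faster overall on
    fewer-is-better metrics."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Run 3 divided by run 1 for three fewer-is-better tests above
# (oneDNN deconv 1d u8s8f32, oneDNN IP Shapes 3D f32, SQLite Speedtest).
ratios = [8.38704 / 8.51023, 11.4454 / 11.5782, 69.057 / 69.136]
print(f"run 3 vs run 1 geometric mean ratio: {geometric_mean(ratios):.4f}")
```

Across these three tests the ratio comes out slightly below 1, matching run 3's first-place sort position in most of the charts above.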
Phoronix Test Suite v10.8.5