ml-run1

AMD Ryzen Threadripper 2920X 12-Core testing with an MSI X399 SLI PLUS (MS-7B09) v2.0 (A.70 BIOS) and ASUS NVIDIA GeForce RTX 2080 Ti 11GB on Ubuntu 18.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2008010-NE-MLRUN145899

System details (ml-run1)

  Processor:          AMD Ryzen Threadripper 2920X 12-Core (12 Cores / 24 Threads)
  Motherboard:        MSI X399 SLI PLUS (MS-7B09) v2.0 (A.70 BIOS)
  Chipset:            AMD 17h
  Memory:             64GB
  Disk:               1000GB Samsung SSD 970 EVO 1TB
  Graphics:           ASUS NVIDIA GeForce RTX 2080 Ti 11GB (1350/7000MHz)
  Audio:              Realtek ALC1220
  Monitor:            E24
  Network:            Intel I211
  OS:                 Ubuntu 18.04
  Kernel:             5.4.0-42-generic (x86_64)
  Desktop:            GNOME Shell 3.28.4
  Display Server:     X Server 1.20.8
  Display Driver:     NVIDIA 440.100
  OpenGL:             4.6.0
  OpenCL:             OpenCL 1.2 CUDA 10.2.185
  Compiler:           GCC 7.5.0
  File-System:        ext4
  Screen Resolution:  1920x1080

System notes

  - Compiler configuration: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v
  - CPU Microcode: 0x800820b
  - GPU Compute Cores: 4352
  - Python 2.7.17 + Python 3.6.9
  - Security mitigations:
      itlb_multihit:      Not affected
      l1tf:               Not affected
      mds:                Not affected
      meltdown:           Not affected
      spec_store_bypass:  Mitigation of SSB disabled via prctl and seccomp
      spectre_v1:         Mitigation of usercopy/swapgs barriers and __user pointer sanitization
      spectre_v2:         Mitigation of Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling
      srbds:              Not affected
      tsx_async_abort:    Not affected

Result overview (ml-run1)

  Test                                                              Result     Unit
  ----------------------------------------------------------------  ---------  -------
  onednn: IP Batch 1D - f32 - CPU                                   5.39728    ms
  onednn: IP Batch All - f32 - CPU                                  71.4652    ms
  onednn: IP Batch 1D - u8s8f32 - CPU                               4.44911    ms
  onednn: IP Batch All - u8s8f32 - CPU                              48.0483    ms
  onednn: Convolution Batch Shapes Auto - f32 - CPU                 10.5290    ms
  onednn: Deconvolution Batch deconv_1d - f32 - CPU                 5.84482    ms
  onednn: Deconvolution Batch deconv_3d - f32 - CPU                 9.17598    ms
  onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU             13.0359    ms
  onednn: Deconvolution Batch deconv_1d - u8s8f32 - CPU             7.83018    ms
  onednn: Deconvolution Batch deconv_3d - u8s8f32 - CPU             7.07491    ms
  onednn: Recurrent Neural Network Training - f32 - CPU             457.178    ms
  onednn: Recurrent Neural Network Inference - f32 - CPU            93.7784    ms
  onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU      2.99577    ms
  onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU  2.80790    ms
  numpy                                                             287.74     Score
  deepspeech                                                        88.64066   Seconds
  rbenchmark                                                        0.2495     Seconds
  tensorflow: Cifar10                                               81.02      Seconds
  plaidml: No - Inference - VGG16 - CPU                             11.62      FPS
  plaidml: No - Inference - ResNet 50 - CPU                         4.90       FPS
  numenta-nab: EXPoSE                                               941.262    Seconds
  numenta-nab: Relative Entropy                                     20.475     Seconds
  numenta-nab: Windowed Gaussian                                    9.601      Seconds
  numenta-nab: Earthgecko Skyline                                   113.030    Seconds
  numenta-nab: Bayesian Changepoint                                 50.119     Seconds
  mlpack: scikit_ica                                                62.19      Seconds
  mlpack: scikit_qda                                                175.63     Seconds
  mlpack: scikit_svm                                                14.19      Seconds
  mlpack: scikit_linearridgeregression                              6.09       Seconds
  scikit-learn                                                      14.729     Seconds

oneDNN 1.5 (ms, Fewer Is Better; Engine: CPU)

  Harness: IP Batch 1D - Data Type: f32
    ml-run1: 5.39728 (SE +/- 0.06856, N = 3, MIN: 4.9)
  Harness: IP Batch All - Data Type: f32
    ml-run1: 71.47 (SE +/- 0.36, N = 3, MIN: 66.97)
  Harness: IP Batch 1D - Data Type: u8s8f32
    ml-run1: 4.44911 (SE +/- 0.01155, N = 3, MIN: 4.24)
  Harness: IP Batch All - Data Type: u8s8f32
    ml-run1: 48.05 (SE +/- 0.15, N = 3, MIN: 46.2)
  Harness: Convolution Batch Shapes Auto - Data Type: f32
    ml-run1: 10.53 (SE +/- 0.01, N = 3, MIN: 10.19)
  Harness: Deconvolution Batch deconv_1d - Data Type: f32
    ml-run1: 5.84482 (SE +/- 0.02020, N = 3, MIN: 5.37)
  Harness: Deconvolution Batch deconv_3d - Data Type: f32
    ml-run1: 9.17598 (SE +/- 0.07358, N = 3, MIN: 8.71)
  Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32
    ml-run1: 13.04 (SE +/- 0.02, N = 3, MIN: 11.44)
  Harness: Deconvolution Batch deconv_1d - Data Type: u8s8f32
    ml-run1: 7.83018 (SE +/- 0.04665, N = 3, MIN: 7.13)
  Harness: Deconvolution Batch deconv_3d - Data Type: u8s8f32
    ml-run1: 7.07491 (SE +/- 0.00995, N = 3, MIN: 6.88)
  Harness: Recurrent Neural Network Training - Data Type: f32
    ml-run1: 457.18 (SE +/- 1.26, N = 3, MIN: 434.86)
  Harness: Recurrent Neural Network Inference - Data Type: f32
    ml-run1: 93.78 (SE +/- 0.14, N = 3, MIN: 89.36)
  Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32
    ml-run1: 2.99577 (SE +/- 0.00327, N = 3, MIN: 2.84)
  Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32
    ml-run1: 2.80790 (SE +/- 0.00980, N = 3, MIN: 2.58)

  1. (CXX) g++ options (all oneDNN tests): -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

Numpy Benchmark (Score, More Is Better)
    ml-run1: 287.74 (SE +/- 0.10, N = 3)

DeepSpeech 0.6 (Seconds, Fewer Is Better)
    ml-run1: 88.64 (SE +/- 0.60, N = 3)

R Benchmark (Seconds, Fewer Is Better)
    ml-run1: 0.2495 (SE +/- 0.0008, N = 3)
  1. R scripting front-end version 3.4.4 (2018-03-15)

Tensorflow (Seconds, Fewer Is Better)
  Build: Cifar10
    ml-run1: 81.02 (SE +/- 0.12, N = 3)

PlaidML (FPS, More Is Better; FP16: No - Mode: Inference - Device: CPU)
  Network: VGG16
    ml-run1: 11.62 (SE +/- 0.16, N = 3)
  Network: ResNet 50
    ml-run1: 4.90 (SE +/- 0.01, N = 3)

Numenta Anomaly Benchmark 1.1 (Seconds, Fewer Is Better)
  Detector: EXPoSE
    ml-run1: 941.26 (SE +/- 9.98, N = 3)
  Detector: Relative Entropy
    ml-run1: 20.48 (SE +/- 0.35, N = 3)
  Detector: Windowed Gaussian
    ml-run1: 9.601 (SE +/- 0.068, N = 3)
  Detector: Earthgecko Skyline
    ml-run1: 113.03 (SE +/- 1.22, N = 3)
  Detector: Bayesian Changepoint
    ml-run1: 50.12 (SE +/- 0.77, N = 3)

Mlpack Benchmark (Seconds, Fewer Is Better)
  Benchmark: scikit_ica
    ml-run1: 62.19 (SE +/- 0.14, N = 3)
  Benchmark: scikit_qda
    ml-run1: 175.63 (SE +/- 0.31, N = 3)
  Benchmark: scikit_svm
    ml-run1: 14.19 (SE +/- 0.17, N = 6)
  Benchmark: scikit_linearridgeregression
    ml-run1: 6.09 (SE +/- 0.05, N = 3)

Scikit-Learn 0.22.1 (Seconds, Fewer Is Better)
    ml-run1: 14.73 (SE +/- 0.03, N = 3)

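
Reading the "SE +/-" figures: the absolute values are hard to compare across tests whose results differ by orders of magnitude, so it can help to express each one relative to its mean. The following is a minimal sketch, not part of the exported result. It assumes that "SE" denotes the standard error of the mean (sample standard deviation divided by sqrt(N)); under that assumption the run-to-run standard deviation can be recovered as SE * sqrt(N). The helper names are hypothetical, and the three sample rows are copied from the results above.

```python
import math

# (mean, reported SE, N) copied from the results above.
results = {
    "oneDNN IP Batch 1D f32 (ms)": (5.39728, 0.06856, 3),
    "Numenta NAB EXPoSE (s)": (941.26, 9.98, 3),
    "Mlpack scikit_svm (s)": (14.19, 0.17, 6),
}


def relative_se_percent(mean, se):
    """Standard error expressed as a percentage of the mean."""
    return 100.0 * se / mean


def sample_std_dev(se, n):
    """Recover the sample standard deviation.

    Assumption: the reported SE is the standard error of the mean,
    i.e. SE = SD / sqrt(N).
    """
    return se * math.sqrt(n)


for name, (mean, se, n) in results.items():
    rel = relative_se_percent(mean, se)
    sd = sample_std_dev(se, n)
    print(f"{name}: relative SE {rel:.2f}%, estimated SD {sd:.3f} (N = {n})")
```

A relative SE of a few percent or less suggests the runs were reasonably repeatable; with N = 3 the estimate is itself coarse, so treat it as a rough indicator only.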
Phoronix Test Suite v10.8.4