AMD EPYC 7F32 8-Core testing with a Supermicro H11DSi-NT v2.00 (2.1 BIOS) and llvmpipe on Ubuntu 20.04 via the Phoronix Test Suite.
Compare your own system(s) to this result file with the Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2012274-HA-EPYC7F32L08
HTML result view exported from: https://openbenchmarking.org/result/2012274-HA-EPYC7F32L08&grs&sor .
EPYC 7F32 Last - System Details (identical for Runs 1-4)

Processor: AMD EPYC 7F32 8-Core @ 3.70GHz (8 Cores / 16 Threads)
Motherboard: Supermicro H11DSi-NT v2.00 (2.1 BIOS)
Chipset: AMD Starship/Matisse
Memory: 64GB
Disk: 280GB INTEL SSDPE21D280GA
Graphics: llvmpipe
Monitor: VE228
OS: Ubuntu 20.04
Kernel: 5.8.0-050800rc6daily20200721-generic (x86_64) 20200720
Desktop: GNOME Shell 3.36.1
Display Server: X Server 1.20.8
Display Driver: modesetting 1.20.8
OpenGL: 3.3 Mesa 20.0.4 (LLVM 9.0.1 128 bits)
Compiler: GCC 9.3.0
File-System: ext4
Screen Resolution: 1920x1080

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Disk Details: NONE / errors=remount-ro,relatime,rw / Block Size: 4096

Processor Details: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0x8301034

Security Details: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
EPYC 7F32 Last - Results Overview

Test | Run 1 | Run 2 | Run 3 | Run 4
ncnn: CPU - googlenet | 15.65 | 15.52 | 15.48 | 16.38
ncnn: CPU - squeezenet_ssd | 24.86 | 24.48 | 24.46 | 25.31
ncnn: CPU - mnasnet | 6.14 | 5.94 | 5.97 | 6.02
unpack-linux: linux-4.15.tar.xz | 5.974 | 5.972 | 6.016 | 5.852
clomp: Static OMP Speedup | 29.8 | 29.6 | 29.0 | 29.7
ncnn: CPU - yolov4-tiny | 27.06 | 26.71 | 26.40 | 27.10
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 9.66757 | 9.62355 | 9.46619 | 9.59529
onednn: IP Shapes 3D - f32 - CPU | 6.22116 | 6.15155 | 6.12560 | 6.17937
ncnn: CPU - blazeface | 3.31 | 3.34 | 3.3 | 3.31
onednn: Convolution Batch Shapes Auto - f32 - CPU | 4.98064 | 4.92337 | 4.95449 | 4.92898
ncnn: CPU - efficientnet-b0 | 10.58 | 10.55 | 10.46 | 10.48
onednn: IP Shapes 3D - u8s8f32 - CPU | 0.788963 | 0.795590 | 0.797020 | 0.788043
ncnn: CPU - regnety_400m | 32.07 | 32.19 | 32.35 | 32.03
ncnn: CPU-v3-v3 - mobilenet-v3 | 6.66 | 6.64 | 6.62 | 6.60
unpack-firefox: firefox-84.0.source.tar.xz | 20.372 | 20.273 | 20.221 | 20.210
ncnn: CPU - resnet18 | 11.52 | 11.57 | 11.48 | 11.57
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 3417.91 | 3441.93 | 3441.20 | 3421.95
ncnn: CPU - mobilenet | 19.94 | 19.81 | 19.89 | 19.88
ncnn: CPU - resnet50 | 23.02 | 22.96 | 22.89 | 23.04
onednn: Deconvolution Batch shapes_1d - f32 - CPU | 4.94409 | 4.96823 | 4.95793 | 4.93690
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 1704.97 | 1715.65 | 1712.85 | 1714.22
build2: Time To Compile | 112.078 | 112.778 | 112.673 | 112.457
ncnn: CPU-v2-v2 - mobilenet-v2 | 6.87 | 6.84 | 6.85 | 6.83
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 1.16439 | 1.15868 | 1.15947 | 1.16157
ncnn: CPU - vgg16 | 32.45 | 32.41 | 32.32 | 32.31
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 1711.39 | 1718.14 | 1716.53 | 1711.28
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 3429.54 | 3438.99 | 3436.78 | 3427.26
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 7.20000 | 7.21888 | 7.22277 | 7.22008
ncnn: CPU - shufflenet-v2 | 9.61 | 9.63 | 9.60 | 9.60
onednn: IP Shapes 1D - f32 - CPU | 3.54939 | 3.55093 | 3.55967 | 3.55938
ncnn: CPU - alexnet | 7.58 | 7.60 | 7.60 | 7.59
build-eigen: Time To Compile | 82.883 | 82.951 | 83.069 | 82.994
onednn: Recurrent Neural Network Inference - f32 - CPU | 1708.29 | 1708.22 | 1711.14 | 1711.87
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 3.59741 | 3.59581 | 3.59103 | 3.59799
encode-ape: WAV To APE | 12.507 | 12.512 | 12.489 | 12.493
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 5.58298 | 5.57986 | 5.57860 | 5.58763
encode-ogg: WAV To Ogg | 20.574 | 20.574 | 20.583 | 20.555
onednn: IP Shapes 1D - u8s8f32 - CPU | 2.79140 | 2.78976 | 2.79003 | 2.79287
onednn: Deconvolution Batch shapes_3d - f32 - CPU | 6.42420 | 6.42196 | 6.42311 | 6.41776
encode-opus: WAV To Opus Encode | 7.972 | 7.973 | 7.969 | 7.966
encode-wavpack: WAV To WavPack | 13.731 | 13.731 | 13.730 | 13.732
onednn: Recurrent Neural Network Training - f32 - CPU | 3604.40 | 3436.43 | 3431.50 | 3417.31
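As a quick sanity check on run-to-run consistency, the spread of any row in the overview table can be computed directly. A minimal sketch using Python's standard `statistics` module, with the four ncnn googlenet timings copied from this result file:

```python
import statistics

# ncnn CPU googlenet timings in ms for Runs 1-4 (values from the table above)
runs = [15.65, 15.52, 15.48, 16.38]

mean = statistics.mean(runs)
# Worst-case spread between runs, as a percentage of the mean
spread_pct = (max(runs) - min(runs)) / mean * 100

print(f"mean = {mean:.2f} ms, max spread = {spread_pct:.1f}%")
# → mean = 15.76 ms, max spread = 5.7%
```

Most rows in the table vary far less than this between runs; googlenet's Run 4 is one of the larger outliers.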
NCNN 20201218 - Target: CPU - Model: googlenet (ms, fewer is better)
Run 3: 15.48 (SE +/- 0.03, N = 3, MIN: 15.01 / MAX: 16.98)
Run 2: 15.52 (SE +/- 0.17, N = 3, MIN: 15 / MAX: 17.92)
Run 1: 15.65 (SE +/- 0.14, N = 3, MIN: 15.06 / MAX: 18.04)
Run 4: 16.38 (SE +/- 0.48, N = 3, MIN: 15.06 / MAX: 17.41)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: squeezenet_ssd (ms, fewer is better)
Run 3: 24.46 (SE +/- 0.02, N = 3, MIN: 23.88 / MAX: 25.94)
Run 2: 24.48 (SE +/- 0.02, N = 3, MIN: 23.9 / MAX: 25.96)
Run 1: 24.86 (SE +/- 0.38, N = 3, MIN: 23.86 / MAX: 26.49)
Run 4: 25.31 (SE +/- 0.42, N = 3, MIN: 23.93 / MAX: 26.27)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: mnasnet (ms, fewer is better)
Run 2: 5.94 (SE +/- 0.02, N = 3, MIN: 5.8 / MAX: 6.68)
Run 3: 5.97 (SE +/- 0.02, N = 3, MIN: 5.77 / MAX: 6.68)
Run 4: 6.02 (SE +/- 0.10, N = 3, MIN: 5.77 / MAX: 28.61)
Run 1: 6.14 (SE +/- 0.20, N = 3, MIN: 5.8 / MAX: 6.7)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Unpacking The Linux Kernel - linux-4.15.tar.xz (Seconds, fewer is better)
Run 4: 5.852 (SE +/- 0.066, N = 4)
Run 2: 5.972 (SE +/- 0.075, N = 5)
Run 1: 5.974 (SE +/- 0.046, N = 4)
Run 3: 6.016 (SE +/- 0.030, N = 4)
CLOMP 1.2 - Static OMP Speedup (Speedup, more is better)
Run 1: 29.8 (SE +/- 0.03, N = 3)
Run 4: 29.7 (SE +/- 0.20, N = 3)
Run 2: 29.6 (SE +/- 0.09, N = 3)
Run 3: 29.0 (SE +/- 0.43, N = 4)
1. (CC) gcc options: -fopenmp -O3 -lm
NCNN 20201218 - Target: CPU - Model: yolov4-tiny (ms, fewer is better)
Run 3: 26.40 (SE +/- 0.02, N = 3, MIN: 25.98 / MAX: 38.47)
Run 2: 26.71 (SE +/- 0.33, N = 3, MIN: 26.02 / MAX: 29.99)
Run 1: 27.06 (SE +/- 0.43, N = 3, MIN: 25.93 / MAX: 29.4)
Run 4: 27.10 (SE +/- 0.24, N = 3, MIN: 26.06 / MAX: 74.45)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
Run 3: 9.46619 (SE +/- 0.03907, N = 3, MIN: 8.89)
Run 4: 9.59529 (SE +/- 0.15046, N = 3, MIN: 8.94)
Run 2: 9.62355 (SE +/- 0.10697, N = 3, MIN: 8.91)
Run 1: 9.66757 (SE +/- 0.11376, N = 3, MIN: 8.92)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, fewer is better)
Run 3: 6.12560 (SE +/- 0.00438, N = 3, MIN: 5.96)
Run 2: 6.15155 (SE +/- 0.02019, N = 3, MIN: 5.98)
Run 4: 6.17937 (SE +/- 0.01738, N = 3, MIN: 5.99)
Run 1: 6.22116 (SE +/- 0.07572, N = 5, MIN: 5.83)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
NCNN 20201218 - Target: CPU - Model: blazeface (ms, fewer is better)
Run 3: 3.30 (SE +/- 0.00, N = 3, MIN: 3.24 / MAX: 3.48)
Run 1: 3.31 (SE +/- 0.03, N = 3, MIN: 3.21 / MAX: 3.54)
Run 4: 3.31 (SE +/- 0.02, N = 3, MIN: 3.18 / MAX: 3.49)
Run 2: 3.34 (SE +/- 0.04, N = 3, MIN: 3.22 / MAX: 3.58)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, fewer is better)
Run 2: 4.92337 (SE +/- 0.01299, N = 3, MIN: 4.84)
Run 4: 4.92898 (SE +/- 0.02121, N = 3, MIN: 4.84)
Run 3: 4.95449 (SE +/- 0.00727, N = 3, MIN: 4.85)
Run 1: 4.98064 (SE +/- 0.03423, N = 3, MIN: 4.85)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
NCNN 20201218 - Target: CPU - Model: efficientnet-b0 (ms, fewer is better)
Run 3: 10.46 (SE +/- 0.01, N = 3, MIN: 10.28 / MAX: 12.58)
Run 4: 10.48 (SE +/- 0.02, N = 3, MIN: 10.32 / MAX: 10.67)
Run 2: 10.55 (SE +/- 0.02, N = 3, MIN: 10.36 / MAX: 11.1)
Run 1: 10.58 (SE +/- 0.12, N = 3, MIN: 10.27 / MAX: 68.23)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
Run 4: 0.788043 (SE +/- 0.005391, N = 3, MIN: 0.74)
Run 1: 0.788963 (SE +/- 0.010611, N = 3, MIN: 0.74)
Run 2: 0.795590 (SE +/- 0.002909, N = 3, MIN: 0.74)
Run 3: 0.797020 (SE +/- 0.004078, N = 3, MIN: 0.74)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
NCNN 20201218 - Target: CPU - Model: regnety_400m (ms, fewer is better)
Run 4: 32.03 (SE +/- 0.31, N = 3, MIN: 31.01 / MAX: 79.48)
Run 1: 32.07 (SE +/- 0.36, N = 3, MIN: 30.83 / MAX: 33.73)
Run 2: 32.19 (SE +/- 0.17, N = 3, MIN: 31.55 / MAX: 81.79)
Run 3: 32.35 (SE +/- 0.09, N = 3, MIN: 31.73 / MAX: 34.27)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, fewer is better)
Run 4: 6.60 (SE +/- 0.03, N = 3, MIN: 6.44 / MAX: 9.53)
Run 3: 6.62 (SE +/- 0.01, N = 3, MIN: 6.46 / MAX: 9.56)
Run 2: 6.64 (SE +/- 0.01, N = 3, MIN: 6.46 / MAX: 9.61)
Run 1: 6.66 (SE +/- 0.05, N = 3, MIN: 6.41 / MAX: 41.46)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Unpacking Firefox 84.0 - Extracting: firefox-84.0.source.tar.xz (Seconds, fewer is better)
Run 4: 20.21 (SE +/- 0.05, N = 4)
Run 3: 20.22 (SE +/- 0.07, N = 4)
Run 2: 20.27 (SE +/- 0.06, N = 4)
Run 1: 20.37 (SE +/- 0.05, N = 4)
NCNN 20201218 - Target: CPU - Model: resnet18 (ms, fewer is better)
Run 3: 11.48 (SE +/- 0.01, N = 3, MIN: 11.3 / MAX: 12.02)
Run 1: 11.52 (SE +/- 0.06, N = 3, MIN: 11.24 / MAX: 12.11)
Run 2: 11.57 (SE +/- 0.05, N = 3, MIN: 11.32 / MAX: 27.4)
Run 4: 11.57 (SE +/- 0.07, N = 3, MIN: 11.31 / MAX: 12.34)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
Run 1: 3417.91 (SE +/- 5.02, N = 3, MIN: 3393.28)
Run 4: 3421.95 (SE +/- 4.16, N = 3, MIN: 3400.56)
Run 3: 3441.20 (SE +/- 4.25, N = 3, MIN: 3419.28)
Run 2: 3441.93 (SE +/- 2.14, N = 3, MIN: 3416.43)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
NCNN 20201218 - Target: CPU - Model: mobilenet (ms, fewer is better)
Run 2: 19.81 (SE +/- 0.03, N = 3, MIN: 19.49 / MAX: 21.97)
Run 4: 19.88 (SE +/- 0.05, N = 3, MIN: 19.31 / MAX: 21.58)
Run 3: 19.89 (SE +/- 0.01, N = 3, MIN: 19.44 / MAX: 23.59)
Run 1: 19.94 (SE +/- 0.08, N = 3, MIN: 19.41 / MAX: 35.06)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: resnet50 (ms, fewer is better)
Run 3: 22.89 (SE +/- 0.05, N = 3, MIN: 22.6 / MAX: 24.13)
Run 2: 22.96 (SE +/- 0.05, N = 3, MIN: 22.62 / MAX: 25.08)
Run 1: 23.02 (SE +/- 0.07, N = 3, MIN: 22.64 / MAX: 24.07)
Run 4: 23.04 (SE +/- 0.06, N = 3, MIN: 22.6 / MAX: 25.32)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, fewer is better)
Run 4: 4.93690 (SE +/- 0.02701, N = 3, MIN: 4.84)
Run 1: 4.94409 (SE +/- 0.02948, N = 3, MIN: 4.83)
Run 3: 4.95793 (SE +/- 0.02466, N = 3, MIN: 4.83)
Run 2: 4.96823 (SE +/- 0.03713, N = 3, MIN: 4.84)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
Run 1: 1704.97 (SE +/- 2.72, N = 3, MIN: 1692.31)
Run 3: 1712.85 (SE +/- 3.55, N = 3, MIN: 1698.24)
Run 4: 1714.22 (SE +/- 4.21, N = 3, MIN: 1701.35)
Run 2: 1715.65 (SE +/- 0.67, N = 3, MIN: 1699.62)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Build2 0.13 - Time To Compile (Seconds, fewer is better)
Run 1: 112.08 (SE +/- 0.41, N = 3)
Run 4: 112.46 (SE +/- 0.42, N = 3)
Run 3: 112.67 (SE +/- 0.88, N = 3)
Run 2: 112.78 (SE +/- 0.31, N = 3)
NCNN 20201218 - Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, fewer is better)
Run 4: 6.83 (SE +/- 0.03, N = 3, MIN: 6.59 / MAX: 9.34)
Run 2: 6.84 (SE +/- 0.00, N = 3, MIN: 6.59 / MAX: 9.23)
Run 3: 6.85 (SE +/- 0.02, N = 3, MIN: 6.55 / MAX: 10)
Run 1: 6.87 (SE +/- 0.03, N = 3, MIN: 6.58 / MAX: 10.87)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, fewer is better)
Run 2: 1.15868 (SE +/- 0.00251, N = 3, MIN: 1.13)
Run 3: 1.15947 (SE +/- 0.00183, N = 3, MIN: 1.13)
Run 4: 1.16157 (SE +/- 0.00127, N = 3, MIN: 1.14)
Run 1: 1.16439 (SE +/- 0.00362, N = 3, MIN: 1.13)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
NCNN 20201218 - Target: CPU - Model: vgg16 (ms, fewer is better)
Run 4: 32.31 (SE +/- 0.02, N = 3, MIN: 32.04 / MAX: 33.23)
Run 3: 32.32 (SE +/- 0.03, N = 3, MIN: 32.09 / MAX: 33.9)
Run 2: 32.41 (SE +/- 0.12, N = 3, MIN: 32.09 / MAX: 92.65)
Run 1: 32.45 (SE +/- 0.12, N = 3, MIN: 32.07 / MAX: 33.99)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
Run 4: 1711.28 (SE +/- 1.34, N = 3, MIN: 1700.32)
Run 1: 1711.39 (SE +/- 1.39, N = 3, MIN: 1698.74)
Run 3: 1716.53 (SE +/- 0.71, N = 3, MIN: 1705.92)
Run 2: 1718.14 (SE +/- 0.28, N = 3, MIN: 1705.47)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
Run 4: 3427.26 (SE +/- 0.62, N = 3, MIN: 3408.14)
Run 1: 3429.54 (SE +/- 4.12, N = 3, MIN: 3403.74)
Run 3: 3436.78 (SE +/- 1.55, N = 3, MIN: 3417.66)
Run 2: 3438.99 (SE +/- 1.62, N = 3, MIN: 3421.47)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
Run 1: 7.20000 (SE +/- 0.02807, N = 3, MIN: 6.99)
Run 2: 7.21888 (SE +/- 0.03613, N = 3, MIN: 6.99)
Run 4: 7.22008 (SE +/- 0.01934, N = 3, MIN: 7.02)
Run 3: 7.22277 (SE +/- 0.01281, N = 3, MIN: 6.95)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
NCNN 20201218 - Target: CPU - Model: shufflenet-v2 (ms, fewer is better)
Run 3: 9.60 (SE +/- 0.01, N = 3, MIN: 9.47 / MAX: 11.88)
Run 4: 9.60 (SE +/- 0.02, N = 3, MIN: 9.49 / MAX: 11.95)
Run 1: 9.61 (SE +/- 0.03, N = 3, MIN: 9.48 / MAX: 12.56)
Run 2: 9.63 (SE +/- 0.01, N = 3, MIN: 9.46 / MAX: 12.11)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, fewer is better)
Run 1: 3.54939 (SE +/- 0.01364, N = 3, MIN: 3.43)
Run 2: 3.55093 (SE +/- 0.00496, N = 3, MIN: 3.42)
Run 4: 3.55938 (SE +/- 0.00543, N = 3, MIN: 3.44)
Run 3: 3.55967 (SE +/- 0.00416, N = 3, MIN: 3.43)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
NCNN 20201218 - Target: CPU - Model: alexnet (ms, fewer is better)
Run 1: 7.58 (SE +/- 0.00, N = 3, MIN: 7.45 / MAX: 8.3)
Run 4: 7.59 (SE +/- 0.01, N = 3, MIN: 7.47 / MAX: 10.36)
Run 2: 7.60 (SE +/- 0.02, N = 3, MIN: 7.47 / MAX: 9.72)
Run 3: 7.60 (SE +/- 0.01, N = 3, MIN: 7.49 / MAX: 8.32)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Timed Eigen Compilation 3.3.9 - Time To Compile (Seconds, fewer is better)
Run 1: 82.88 (SE +/- 0.02, N = 3)
Run 2: 82.95 (SE +/- 0.01, N = 3)
Run 4: 82.99 (SE +/- 0.04, N = 3)
Run 3: 83.07 (SE +/- 0.03, N = 3)
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, fewer is better)
Run 2: 1708.22 (SE +/- 1.85, N = 3, MIN: 1693.22)
Run 1: 1708.29 (SE +/- 2.12, N = 3, MIN: 1692.22)
Run 3: 1711.14 (SE +/- 2.87, N = 3, MIN: 1695.84)
Run 4: 1711.87 (SE +/- 3.16, N = 3, MIN: 1697.11)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
Run 3: 3.59103 (SE +/- 0.00094, N = 3, MIN: 3.53)
Run 2: 3.59581 (SE +/- 0.00105, N = 3, MIN: 3.52)
Run 1: 3.59741 (SE +/- 0.00436, N = 3, MIN: 3.52)
Run 4: 3.59799 (SE +/- 0.00461, N = 3, MIN: 3.53)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Monkey Audio Encoding 3.99.6 - WAV To APE (Seconds, fewer is better)
Run 3: 12.49 (SE +/- 0.01, N = 5)
Run 4: 12.49 (SE +/- 0.01, N = 5)
Run 1: 12.51 (SE +/- 0.02, N = 5)
Run 2: 12.51 (SE +/- 0.01, N = 5)
1. (CXX) g++ options: -O3 -pedantic -rdynamic -lrt
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
Run 3: 5.57860 (SE +/- 0.00444, N = 3, MIN: 5.54)
Run 2: 5.57986 (SE +/- 0.00050, N = 3, MIN: 5.53)
Run 1: 5.58298 (SE +/- 0.00232, N = 3, MIN: 5.53)
Run 4: 5.58763 (SE +/- 0.00320, N = 3, MIN: 5.53)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Ogg Audio Encoding 1.3.4 - WAV To Ogg (Seconds, fewer is better)
Run 4: 20.56 (SE +/- 0.02, N = 3)
Run 1: 20.57 (SE +/- 0.02, N = 3)
Run 2: 20.57 (SE +/- 0.02, N = 3)
Run 3: 20.58 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -O2 -ffast-math -fsigned-char
oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
Run 2: 2.78976 (SE +/- 0.00373, N = 3, MIN: 2.75)
Run 3: 2.79003 (SE +/- 0.00303, N = 3, MIN: 2.73)
Run 1: 2.79140 (SE +/- 0.00493, N = 3, MIN: 2.74)
Run 4: 2.79287 (SE +/- 0.00526, N = 3, MIN: 2.75)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, fewer is better)
Run 4: 6.41776 (SE +/- 0.00380, N = 3, MIN: 6.38)
Run 2: 6.42196 (SE +/- 0.00463, N = 3, MIN: 6.38)
Run 3: 6.42311 (SE +/- 0.00067, N = 3, MIN: 6.38)
Run 1: 6.42420 (SE +/- 0.00261, N = 3, MIN: 6.37)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Opus Codec Encoding 1.3.1 - WAV To Opus Encode (Seconds, fewer is better)
Run 4: 7.966 (SE +/- 0.004, N = 5)
Run 3: 7.969 (SE +/- 0.003, N = 5)
Run 1: 7.972 (SE +/- 0.006, N = 5)
Run 2: 7.973 (SE +/- 0.004, N = 5)
1. (CXX) g++ options: -fvisibility=hidden -logg -lm
WavPack Audio Encoding 5.3 - WAV To WavPack (Seconds, fewer is better)
Run 3: 13.73 (SE +/- 0.00, N = 5)
Run 1: 13.73 (SE +/- 0.00, N = 5)
Run 2: 13.73 (SE +/- 0.00, N = 5)
Run 4: 13.73 (SE +/- 0.00, N = 5)
1. (CXX) g++ options: -rdynamic
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, fewer is better)
Run 4: 3417.31 (SE +/- 1.19, N = 3, MIN: 3402.75)
Run 3: 3431.50 (SE +/- 5.38, N = 3, MIN: 3412.54)
Run 2: 3436.43 (SE +/- 1.74, N = 3, MIN: 3416.4)
Run 1: 3604.40 (SE +/- 121.83, N = 15, MIN: 3398.06)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Phoronix Test Suite v10.8.4