AMD EPYC 7F32 8-Core testing with a Supermicro H11DSi-NT v2.00 (2.1 BIOS) and llvmpipe on Ubuntu 20.04 via the Phoronix Test Suite.
Compare your own system(s) to this result file with the Phoronix Test Suite by running the command:
    phoronix-test-suite benchmark 2012274-HA-EPYC7F32L08
HTML result view exported from: https://openbenchmarking.org/result/2012274-HA-EPYC7F32L08&sor&grw .
EPYC 7F32 Last
Runs: Run 1, Run 2, Run 3, Run 4
Processor: AMD EPYC 7F32 8-Core @ 3.70GHz (8 Cores / 16 Threads)
Motherboard: Supermicro H11DSi-NT v2.00 (2.1 BIOS)
Chipset: AMD Starship/Matisse
Memory: 64GB
Disk: 280GB INTEL SSDPE21D280GA
Graphics: llvmpipe
Monitor: VE228
OS: Ubuntu 20.04
Kernel: 5.8.0-050800rc6daily20200721-generic (x86_64) 20200720
Desktop: GNOME Shell 3.36.1
Display Server: X Server 1.20.8
Display Driver: modesetting 1.20.8
OpenGL: 3.3 Mesa 20.0.4 (LLVM 9.0.1 128 bits)
Compiler: GCC 9.3.0
File-System: ext4
Screen Resolution: 1920x1080

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Disk Details
- NONE / errors=remount-ro,relatime,rw / Block Size: 4096

Processor Details
- Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled)
- CPU Microcode: 0x8301034

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling
- srbds: Not affected
- tsx_async_abort: Not affected
EPYC 7F32 Last
Test | Run 1 | Run 2 | Run 3 | Run 4
clomp: Static OMP Speedup | 29.8 | 29.6 | 29.0 | 29.7
encode-ape: WAV To APE | 12.507 | 12.512 | 12.489 | 12.493
encode-opus: WAV To Opus Encode | 7.972 | 7.973 | 7.969 | 7.966
encode-wavpack: WAV To WavPack | 13.731 | 13.731 | 13.730 | 13.732
encode-ogg: WAV To Ogg | 20.574 | 20.574 | 20.583 | 20.555
ncnn: CPU - mobilenet | 19.94 | 19.81 | 19.89 | 19.88
ncnn: CPU-v2-v2 - mobilenet-v2 | 6.87 | 6.84 | 6.85 | 6.83
ncnn: CPU-v3-v3 - mobilenet-v3 | 6.66 | 6.64 | 6.62 | 6.60
ncnn: CPU - shufflenet-v2 | 9.61 | 9.63 | 9.60 | 9.60
ncnn: CPU - mnasnet | 6.14 | 5.94 | 5.97 | 6.02
ncnn: CPU - efficientnet-b0 | 10.58 | 10.55 | 10.46 | 10.48
ncnn: CPU - blazeface | 3.31 | 3.34 | 3.3 | 3.31
ncnn: CPU - googlenet | 15.65 | 15.52 | 15.48 | 16.38
ncnn: CPU - vgg16 | 32.45 | 32.41 | 32.32 | 32.31
ncnn: CPU - resnet18 | 11.52 | 11.57 | 11.48 | 11.57
ncnn: CPU - alexnet | 7.58 | 7.60 | 7.60 | 7.59
ncnn: CPU - resnet50 | 23.02 | 22.96 | 22.89 | 23.04
ncnn: CPU - yolov4-tiny | 27.06 | 26.71 | 26.40 | 27.10
ncnn: CPU - squeezenet_ssd | 24.86 | 24.48 | 24.46 | 25.31
ncnn: CPU - regnety_400m | 32.07 | 32.19 | 32.35 | 32.03
onednn: IP Shapes 1D - f32 - CPU | 3.54939 | 3.55093 | 3.55967 | 3.55938
onednn: IP Shapes 3D - f32 - CPU | 6.22116 | 6.15155 | 6.12560 | 6.17937
onednn: IP Shapes 1D - u8s8f32 - CPU | 2.79140 | 2.78976 | 2.79003 | 2.79287
onednn: IP Shapes 3D - u8s8f32 - CPU | 0.788963 | 0.795590 | 0.797020 | 0.788043
onednn: Convolution Batch Shapes Auto - f32 - CPU | 4.98064 | 4.92337 | 4.95449 | 4.92898
onednn: Deconvolution Batch shapes_1d - f32 - CPU | 4.94409 | 4.96823 | 4.95793 | 4.93690
onednn: Deconvolution Batch shapes_3d - f32 - CPU | 6.42420 | 6.42196 | 6.42311 | 6.41776
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 9.66757 | 9.62355 | 9.46619 | 9.59529
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 7.20000 | 7.21888 | 7.22277 | 7.22008
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 5.58298 | 5.57986 | 5.57860 | 5.58763
onednn: Recurrent Neural Network Training - f32 - CPU | 3604.40 | 3436.43 | 3431.50 | 3417.31
onednn: Recurrent Neural Network Inference - f32 - CPU | 1708.29 | 1708.22 | 1711.14 | 1711.87
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 3429.54 | 3438.99 | 3436.78 | 3427.26
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 1711.39 | 1718.14 | 1716.53 | 1711.28
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 1.16439 | 1.15868 | 1.15947 | 1.16157
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 3417.91 | 3441.93 | 3441.20 | 3421.95
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 1704.97 | 1715.65 | 1712.85 | 1714.22
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 3.59741 | 3.59581 | 3.59103 | 3.59799
build2: Time To Compile | 112.078 | 112.778 | 112.673 | 112.457
build-eigen: Time To Compile | 82.883 | 82.951 | 83.069 | 82.994
unpack-firefox: firefox-84.0.source.tar.xz | 20.372 | 20.273 | 20.221 | 20.210
unpack-linux: linux-4.15.tar.xz | 5.974 | 5.972 | 6.016 | 5.852
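One way to judge how consistent the four runs are is to compute a relative spread per test: the gap between the largest and smallest result, divided by the smallest. The Python sketch below does this for three rows copied from the overview above. The helper name `relative_spread` and the choice of metric are illustrative assumptions; the Phoronix Test Suite does not report this figure.

```python
# Hypothetical helper (not part of the Phoronix Test Suite):
# relative spread across runs, expressed as a percentage.
def relative_spread(values):
    """Return (max - min) / min * 100 for a list of run results."""
    lo, hi = min(values), max(values)
    return (hi - lo) / lo * 100.0


# Run 1..Run 4 averages copied from the results overview.
results = {
    "clomp: Static OMP Speedup": [29.8, 29.6, 29.0, 29.7],
    "ncnn: CPU - googlenet": [15.65, 15.52, 15.48, 16.38],
    "onednn: Recurrent Neural Network Training - f32 - CPU": [
        3604.40, 3436.43, 3431.50, 3417.31,
    ],
}

for name, values in results.items():
    print(f"{name}: {relative_spread(values):.2f}% spread")
```

The metric is direction-agnostic, so it works the same for "More Is Better" and "Fewer Is Better" tests; it only says how far apart the runs landed, not which run was best.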
CLOMP 1.2 - Static OMP Speedup (Speedup, More Is Better)
Run 1: 29.8 (SE +/- 0.03, N = 3)
Run 4: 29.7 (SE +/- 0.20, N = 3)
Run 2: 29.6 (SE +/- 0.09, N = 3)
Run 3: 29.0 (SE +/- 0.43, N = 4)
1. (CC) gcc options: -fopenmp -O3 -lm
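The "SE +/-" figure next to each result is presumably the standard error of the mean over the N trials of that run. Assuming the usual definition (sample standard deviation divided by the square root of N), it can be sketched in Python as follows. The trial values are made up for illustration, since this export lists only the per-run averages, and whether the Phoronix Test Suite uses the sample or the population standard deviation is not stated here.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample stdev divided by sqrt(N)."""
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical per-trial values, NOT taken from this result file.
trials = [10.0, 11.0, 12.0]

mean = statistics.mean(trials)
se = standard_error(trials)
print(f"{mean:.2f} (SE +/- {se:.2f}, N = {len(trials)})")
```

A run whose SE is large relative to its mean, or whose N is higher than that of its neighbours, is worth a second look before comparing averages.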
Monkey Audio Encoding 3.99.6 - WAV To APE (Seconds, Fewer Is Better)
Run 3: 12.49 (SE +/- 0.01, N = 5)
Run 4: 12.49 (SE +/- 0.01, N = 5)
Run 1: 12.51 (SE +/- 0.02, N = 5)
Run 2: 12.51 (SE +/- 0.01, N = 5)
1. (CXX) g++ options: -O3 -pedantic -rdynamic -lrt

Opus Codec Encoding 1.3.1 - WAV To Opus Encode (Seconds, Fewer Is Better)
Run 4: 7.966 (SE +/- 0.004, N = 5)
Run 3: 7.969 (SE +/- 0.003, N = 5)
Run 1: 7.972 (SE +/- 0.006, N = 5)
Run 2: 7.973 (SE +/- 0.004, N = 5)
1. (CXX) g++ options: -fvisibility=hidden -logg -lm

WavPack Audio Encoding 5.3 - WAV To WavPack (Seconds, Fewer Is Better)
Run 3: 13.73 (SE +/- 0.00, N = 5)
Run 1: 13.73 (SE +/- 0.00, N = 5)
Run 2: 13.73 (SE +/- 0.00, N = 5)
Run 4: 13.73 (SE +/- 0.00, N = 5)
1. (CXX) g++ options: -rdynamic

Ogg Audio Encoding 1.3.4 - WAV To Ogg (Seconds, Fewer Is Better)
Run 4: 20.56 (SE +/- 0.02, N = 3)
Run 1: 20.57 (SE +/- 0.02, N = 3)
Run 2: 20.57 (SE +/- 0.02, N = 3)
Run 3: 20.58 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -O2 -ffast-math -fsigned-char
NCNN 20201218 - Target: CPU - Model: mobilenet (ms, Fewer Is Better)
Run 2: 19.81 (SE +/- 0.03, N = 3; MIN: 19.49 / MAX: 21.97)
Run 4: 19.88 (SE +/- 0.05, N = 3; MIN: 19.31 / MAX: 21.58)
Run 3: 19.89 (SE +/- 0.01, N = 3; MIN: 19.44 / MAX: 23.59)
Run 1: 19.94 (SE +/- 0.08, N = 3; MIN: 19.41 / MAX: 35.06)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
Run 4: 6.83 (SE +/- 0.03, N = 3; MIN: 6.59 / MAX: 9.34)
Run 2: 6.84 (SE +/- 0.00, N = 3; MIN: 6.59 / MAX: 9.23)
Run 3: 6.85 (SE +/- 0.02, N = 3; MIN: 6.55 / MAX: 10)
Run 1: 6.87 (SE +/- 0.03, N = 3; MIN: 6.58 / MAX: 10.87)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
Run 4: 6.60 (SE +/- 0.03, N = 3; MIN: 6.44 / MAX: 9.53)
Run 3: 6.62 (SE +/- 0.01, N = 3; MIN: 6.46 / MAX: 9.56)
Run 2: 6.64 (SE +/- 0.01, N = 3; MIN: 6.46 / MAX: 9.61)
Run 1: 6.66 (SE +/- 0.05, N = 3; MIN: 6.41 / MAX: 41.46)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better)
Run 3: 9.60 (SE +/- 0.01, N = 3; MIN: 9.47 / MAX: 11.88)
Run 4: 9.60 (SE +/- 0.02, N = 3; MIN: 9.49 / MAX: 11.95)
Run 1: 9.61 (SE +/- 0.03, N = 3; MIN: 9.48 / MAX: 12.56)
Run 2: 9.63 (SE +/- 0.01, N = 3; MIN: 9.46 / MAX: 12.11)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: mnasnet (ms, Fewer Is Better)
Run 2: 5.94 (SE +/- 0.02, N = 3; MIN: 5.8 / MAX: 6.68)
Run 3: 5.97 (SE +/- 0.02, N = 3; MIN: 5.77 / MAX: 6.68)
Run 4: 6.02 (SE +/- 0.10, N = 3; MIN: 5.77 / MAX: 28.61)
Run 1: 6.14 (SE +/- 0.20, N = 3; MIN: 5.8 / MAX: 6.7)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better)
Run 3: 10.46 (SE +/- 0.01, N = 3; MIN: 10.28 / MAX: 12.58)
Run 4: 10.48 (SE +/- 0.02, N = 3; MIN: 10.32 / MAX: 10.67)
Run 2: 10.55 (SE +/- 0.02, N = 3; MIN: 10.36 / MAX: 11.1)
Run 1: 10.58 (SE +/- 0.12, N = 3; MIN: 10.27 / MAX: 68.23)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: blazeface (ms, Fewer Is Better)
Run 3: 3.30 (SE +/- 0.00, N = 3; MIN: 3.24 / MAX: 3.48)
Run 1: 3.31 (SE +/- 0.03, N = 3; MIN: 3.21 / MAX: 3.54)
Run 4: 3.31 (SE +/- 0.02, N = 3; MIN: 3.18 / MAX: 3.49)
Run 2: 3.34 (SE +/- 0.04, N = 3; MIN: 3.22 / MAX: 3.58)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: googlenet (ms, Fewer Is Better)
Run 3: 15.48 (SE +/- 0.03, N = 3; MIN: 15.01 / MAX: 16.98)
Run 2: 15.52 (SE +/- 0.17, N = 3; MIN: 15 / MAX: 17.92)
Run 1: 15.65 (SE +/- 0.14, N = 3; MIN: 15.06 / MAX: 18.04)
Run 4: 16.38 (SE +/- 0.48, N = 3; MIN: 15.06 / MAX: 17.41)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: vgg16 (ms, Fewer Is Better)
Run 4: 32.31 (SE +/- 0.02, N = 3; MIN: 32.04 / MAX: 33.23)
Run 3: 32.32 (SE +/- 0.03, N = 3; MIN: 32.09 / MAX: 33.9)
Run 2: 32.41 (SE +/- 0.12, N = 3; MIN: 32.09 / MAX: 92.65)
Run 1: 32.45 (SE +/- 0.12, N = 3; MIN: 32.07 / MAX: 33.99)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: resnet18 (ms, Fewer Is Better)
Run 3: 11.48 (SE +/- 0.01, N = 3; MIN: 11.3 / MAX: 12.02)
Run 1: 11.52 (SE +/- 0.06, N = 3; MIN: 11.24 / MAX: 12.11)
Run 2: 11.57 (SE +/- 0.05, N = 3; MIN: 11.32 / MAX: 27.4)
Run 4: 11.57 (SE +/- 0.07, N = 3; MIN: 11.31 / MAX: 12.34)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: alexnet (ms, Fewer Is Better)
Run 1: 7.58 (SE +/- 0.00, N = 3; MIN: 7.45 / MAX: 8.3)
Run 4: 7.59 (SE +/- 0.01, N = 3; MIN: 7.47 / MAX: 10.36)
Run 2: 7.60 (SE +/- 0.02, N = 3; MIN: 7.47 / MAX: 9.72)
Run 3: 7.60 (SE +/- 0.01, N = 3; MIN: 7.49 / MAX: 8.32)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: resnet50 (ms, Fewer Is Better)
Run 3: 22.89 (SE +/- 0.05, N = 3; MIN: 22.6 / MAX: 24.13)
Run 2: 22.96 (SE +/- 0.05, N = 3; MIN: 22.62 / MAX: 25.08)
Run 1: 23.02 (SE +/- 0.07, N = 3; MIN: 22.64 / MAX: 24.07)
Run 4: 23.04 (SE +/- 0.06, N = 3; MIN: 22.6 / MAX: 25.32)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better)
Run 3: 26.40 (SE +/- 0.02, N = 3; MIN: 25.98 / MAX: 38.47)
Run 2: 26.71 (SE +/- 0.33, N = 3; MIN: 26.02 / MAX: 29.99)
Run 1: 27.06 (SE +/- 0.43, N = 3; MIN: 25.93 / MAX: 29.4)
Run 4: 27.10 (SE +/- 0.24, N = 3; MIN: 26.06 / MAX: 74.45)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better)
Run 3: 24.46 (SE +/- 0.02, N = 3; MIN: 23.88 / MAX: 25.94)
Run 2: 24.48 (SE +/- 0.02, N = 3; MIN: 23.9 / MAX: 25.96)
Run 1: 24.86 (SE +/- 0.38, N = 3; MIN: 23.86 / MAX: 26.49)
Run 4: 25.31 (SE +/- 0.42, N = 3; MIN: 23.93 / MAX: 26.27)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: regnety_400m (ms, Fewer Is Better)
Run 4: 32.03 (SE +/- 0.31, N = 3; MIN: 31.01 / MAX: 79.48)
Run 1: 32.07 (SE +/- 0.36, N = 3; MIN: 30.83 / MAX: 33.73)
Run 2: 32.19 (SE +/- 0.17, N = 3; MIN: 31.55 / MAX: 81.79)
Run 3: 32.35 (SE +/- 0.09, N = 3; MIN: 31.73 / MAX: 34.27)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
Run 1: 3.54939 (SE +/- 0.01364, N = 3; MIN: 3.43)
Run 2: 3.55093 (SE +/- 0.00496, N = 3; MIN: 3.42)
Run 4: 3.55938 (SE +/- 0.00543, N = 3; MIN: 3.44)
Run 3: 3.55967 (SE +/- 0.00416, N = 3; MIN: 3.43)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
Run 3: 6.12560 (SE +/- 0.00438, N = 3; MIN: 5.96)
Run 2: 6.15155 (SE +/- 0.02019, N = 3; MIN: 5.98)
Run 4: 6.17937 (SE +/- 0.01738, N = 3; MIN: 5.99)
Run 1: 6.22116 (SE +/- 0.07572, N = 5; MIN: 5.83)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
Run 2: 2.78976 (SE +/- 0.00373, N = 3; MIN: 2.75)
Run 3: 2.79003 (SE +/- 0.00303, N = 3; MIN: 2.73)
Run 1: 2.79140 (SE +/- 0.00493, N = 3; MIN: 2.74)
Run 4: 2.79287 (SE +/- 0.00526, N = 3; MIN: 2.75)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
Run 4: 0.788043 (SE +/- 0.005391, N = 3; MIN: 0.74)
Run 1: 0.788963 (SE +/- 0.010611, N = 3; MIN: 0.74)
Run 2: 0.795590 (SE +/- 0.002909, N = 3; MIN: 0.74)
Run 3: 0.797020 (SE +/- 0.004078, N = 3; MIN: 0.74)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
Run 2: 4.92337 (SE +/- 0.01299, N = 3; MIN: 4.84)
Run 4: 4.92898 (SE +/- 0.02121, N = 3; MIN: 4.84)
Run 3: 4.95449 (SE +/- 0.00727, N = 3; MIN: 4.85)
Run 1: 4.98064 (SE +/- 0.03423, N = 3; MIN: 4.85)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
Run 4: 4.93690 (SE +/- 0.02701, N = 3; MIN: 4.84)
Run 1: 4.94409 (SE +/- 0.02948, N = 3; MIN: 4.83)
Run 3: 4.95793 (SE +/- 0.02466, N = 3; MIN: 4.83)
Run 2: 4.96823 (SE +/- 0.03713, N = 3; MIN: 4.84)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
Run 4: 6.41776 (SE +/- 0.00380, N = 3; MIN: 6.38)
Run 2: 6.42196 (SE +/- 0.00463, N = 3; MIN: 6.38)
Run 3: 6.42311 (SE +/- 0.00067, N = 3; MIN: 6.38)
Run 1: 6.42420 (SE +/- 0.00261, N = 3; MIN: 6.37)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
Run 3: 9.46619 (SE +/- 0.03907, N = 3; MIN: 8.89)
Run 4: 9.59529 (SE +/- 0.15046, N = 3; MIN: 8.94)
Run 2: 9.62355 (SE +/- 0.10697, N = 3; MIN: 8.91)
Run 1: 9.66757 (SE +/- 0.11376, N = 3; MIN: 8.92)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
Run 1: 7.20000 (SE +/- 0.02807, N = 3; MIN: 6.99)
Run 2: 7.21888 (SE +/- 0.03613, N = 3; MIN: 6.99)
Run 4: 7.22008 (SE +/- 0.01934, N = 3; MIN: 7.02)
Run 3: 7.22277 (SE +/- 0.01281, N = 3; MIN: 6.95)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
Run 3: 5.57860 (SE +/- 0.00444, N = 3; MIN: 5.54)
Run 2: 5.57986 (SE +/- 0.00050, N = 3; MIN: 5.53)
Run 1: 5.58298 (SE +/- 0.00232, N = 3; MIN: 5.53)
Run 4: 5.58763 (SE +/- 0.00320, N = 3; MIN: 5.53)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
Run 4: 3417.31 (SE +/- 1.19, N = 3; MIN: 3402.75)
Run 3: 3431.50 (SE +/- 5.38, N = 3; MIN: 3412.54)
Run 2: 3436.43 (SE +/- 1.74, N = 3; MIN: 3416.4)
Run 1: 3604.40 (SE +/- 121.83, N = 15; MIN: 3398.06)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
Run 2: 1708.22 (SE +/- 1.85, N = 3; MIN: 1693.22)
Run 1: 1708.29 (SE +/- 2.12, N = 3; MIN: 1692.22)
Run 3: 1711.14 (SE +/- 2.87, N = 3; MIN: 1695.84)
Run 4: 1711.87 (SE +/- 3.16, N = 3; MIN: 1697.11)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
Run 4: 3427.26 (SE +/- 0.62, N = 3; MIN: 3408.14)
Run 1: 3429.54 (SE +/- 4.12, N = 3; MIN: 3403.74)
Run 3: 3436.78 (SE +/- 1.55, N = 3; MIN: 3417.66)
Run 2: 3438.99 (SE +/- 1.62, N = 3; MIN: 3421.47)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
Run 4: 1711.28 (SE +/- 1.34, N = 3; MIN: 1700.32)
Run 1: 1711.39 (SE +/- 1.39, N = 3; MIN: 1698.74)
Run 3: 1716.53 (SE +/- 0.71, N = 3; MIN: 1705.92)
Run 2: 1718.14 (SE +/- 0.28, N = 3; MIN: 1705.47)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
Run 2: 1.15868 (SE +/- 0.00251, N = 3; MIN: 1.13)
Run 3: 1.15947 (SE +/- 0.00183, N = 3; MIN: 1.13)
Run 4: 1.16157 (SE +/- 0.00127, N = 3; MIN: 1.14)
Run 1: 1.16439 (SE +/- 0.00362, N = 3; MIN: 1.13)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
Run 1: 3417.91 (SE +/- 5.02, N = 3; MIN: 3393.28)
Run 4: 3421.95 (SE +/- 4.16, N = 3; MIN: 3400.56)
Run 3: 3441.20 (SE +/- 4.25, N = 3; MIN: 3419.28)
Run 2: 3441.93 (SE +/- 2.14, N = 3; MIN: 3416.43)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
Run 1: 1704.97 (SE +/- 2.72, N = 3; MIN: 1692.31)
Run 3: 1712.85 (SE +/- 3.55, N = 3; MIN: 1698.24)
Run 4: 1714.22 (SE +/- 4.21, N = 3; MIN: 1701.35)
Run 2: 1715.65 (SE +/- 0.67, N = 3; MIN: 1699.62)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
Run 3: 3.59103 (SE +/- 0.00094, N = 3; MIN: 3.53)
Run 2: 3.59581 (SE +/- 0.00105, N = 3; MIN: 3.52)
Run 1: 3.59741 (SE +/- 0.00436, N = 3; MIN: 3.52)
Run 4: 3.59799 (SE +/- 0.00461, N = 3; MIN: 3.53)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Build2 0.13 - Time To Compile (Seconds, Fewer Is Better)
Run 1: 112.08 (SE +/- 0.41, N = 3)
Run 4: 112.46 (SE +/- 0.42, N = 3)
Run 3: 112.67 (SE +/- 0.88, N = 3)
Run 2: 112.78 (SE +/- 0.31, N = 3)

Timed Eigen Compilation 3.3.9 - Time To Compile (Seconds, Fewer Is Better)
Run 1: 82.88 (SE +/- 0.02, N = 3)
Run 2: 82.95 (SE +/- 0.01, N = 3)
Run 4: 82.99 (SE +/- 0.04, N = 3)
Run 3: 83.07 (SE +/- 0.03, N = 3)

Unpacking Firefox 84.0 - Extracting: firefox-84.0.source.tar.xz (Seconds, Fewer Is Better)
Run 4: 20.21 (SE +/- 0.05, N = 4)
Run 3: 20.22 (SE +/- 0.07, N = 4)
Run 2: 20.27 (SE +/- 0.06, N = 4)
Run 1: 20.37 (SE +/- 0.05, N = 4)

Unpacking The Linux Kernel - linux-4.15.tar.xz (Seconds, Fewer Is Better)
Run 4: 5.852 (SE +/- 0.066, N = 4)
Run 2: 5.972 (SE +/- 0.075, N = 5)
Run 1: 5.974 (SE +/- 0.046, N = 4)
Run 3: 6.016 (SE +/- 0.030, N = 4)
Phoronix Test Suite v10.8.4