Xeon E3 1280 v5 m Intel Xeon E3-1280 v5 testing with a MSI Z170A SLI PLUS (MS-7998) v1.0 (2.A0 BIOS) and ASUS AMD Radeon HD 7850 / R7 265 R9 270 1024SP on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2103199-IB-XEONE312846&sro .
Xeon E3 1280 v5 m Processor Motherboard Chipset Memory Disk Graphics Audio Monitor Network OS Kernel Desktop Display Server OpenGL Compiler File-System Screen Resolution 1 2 3 Intel Xeon E3-1280 v5 @ 4.00GHz (4 Cores / 8 Threads) MSI Z170A SLI PLUS (MS-7998) v1.0 (2.A0 BIOS) Intel Xeon E3-1200 v5/E3-1500 32GB 256GB TOSHIBA RD400 ASUS AMD Radeon HD 7850 / R7 265 R9 270 1024SP Realtek ALC1150 VA2431 Intel I219-V Ubuntu 20.04 5.9.0-050900rc2daily20200826-generic (x86_64) 20200825 GNOME Shell 3.36.4 X Server 1.20.9 4.5 Mesa 20.0.8 (LLVM 10.0.0) GCC 9.3.0 ext4 1920x1080 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: intel_pstate powersave - CPU Microcode: 0xe2 - Thermald 1.9.1 Python Details - Python 3.8.5 Security Details - itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable + mds: Mitigation of Clear buffers; SMT vulnerable + meltdown: Mitigation of PTI + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Mitigation of Microcode + tsx_async_abort: Mitigation of Clear buffers; SMT vulnerable
Xeon E3 1280 v5 m incompact3d: input.i3d 129 Cells Per Direction incompact3d: input.i3d 193 Cells Per Direction simdjson: Kostya simdjson: LargeRand simdjson: PartialTweets simdjson: DistinctUserID aom-av1: Speed 0 Two-Pass aom-av1: Speed 4 Two-Pass aom-av1: Speed 6 Realtime aom-av1: Speed 6 Two-Pass aom-av1: Speed 8 Realtime svt-hevc: 1 - Bosphorus 1080p svt-hevc: 7 - Bosphorus 1080p svt-hevc: 10 - Bosphorus 1080p svt-vp9: VMAF Optimized - Bosphorus 1080p svt-vp9: PSNR/SSIM Optimized - Bosphorus 1080p svt-vp9: Visual Quality Optimized - Bosphorus 1080p build-mesa: Time To Compile build-nodejs: Time To Compile onednn: IP Shapes 1D - f32 - CPU onednn: IP Shapes 3D - f32 - CPU onednn: IP Shapes 1D - u8s8f32 - CPU onednn: IP Shapes 3D - u8s8f32 - CPU onednn: Convolution Batch Shapes Auto - f32 - CPU onednn: Deconvolution Batch shapes_1d - f32 - CPU onednn: Deconvolution Batch shapes_3d - f32 - CPU onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU onednn: Recurrent Neural Network Training - f32 - CPU onednn: Recurrent Neural Network Inference - f32 - CPU onednn: Recurrent Neural Network Training - u8s8f32 - CPU onednn: Recurrent Neural Network Inference - u8s8f32 - CPU onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU astcenc: Medium astcenc: Thorough astcenc: Exhaustive basis: ETC1S basis: UASTC Level 0 basis: UASTC Level 2 basis: UASTC Level 3 mnn: SqueezeNetV1.0 mnn: resnet-v2-50 mnn: MobileNetV2_224 mnn: mobilenet-v1-1.0 mnn: inception-v3 sysbench: RAM / Memory sysbench: CPU stockfish: Total Time 1 2 3 59.7731183 203.088511 2.36 0.88 3.52 3.99 0.16 3.79 12.31 9.87 62.44 3.08 47.85 101.85 87.85 88.01 69.82 126.263 1106.502 7.98948 12.2753 3.65702 3.28942 20.9020 14.4552 14.4010 20.4976 4.77871 8.07519 7395.55 3948.98 7393.22 3952.65 5.42817 7395.13 3955.37 5.91706 8.9180 33.0108 256.2135 36.774 11.006 72.898 144.169 7.550 45.714 4.033 4.577 55.425 16690.54 7854.28 10343841 59.8612671 202.932215 2.36 0.88 3.53 3.98 0.16 3.79 12.33 9.86 62.54 3.08 47.78 101.85 87.66 88.04 70.00 126.186 1106.636 8.01298 12.1013 3.66132 3.28091 20.9333 14.4014 14.5108 20.5233 4.76344 8.10192 7399.68 3955.17 7401.56 3952.08 5.42101 7396.46 3954.96 5.90636 8.9380 32.9899 256.0918 36.835 10.997 72.911 144.079 7.479 45.354 4.011 4.545 54.835 16726.66 7845.28 10338531 59.8264809 203.135991 2.36 0.88 3.52 3.98 0.16 3.80 12.35 9.84 62.45 3.08 47.79 101.98 87.74 88.06 69.87 126.386 1106.682 7.97862 12.0587 3.65956 3.20399 20.9031 14.4296 14.4641 20.5150 4.78576 8.08423 7394.30 3951.90 7396.59 3955.00 5.38817 7408.91 3957.84 5.90709 8.9296 33.0202 256.1716 36.839 10.992 72.894 144.079 7.458 45.280 4.001 4.534 55.042 16845.49 7843.34 10254227 OpenBenchmarking.org
Xcompact3d Incompact3d Input: input.i3d 129 Cells Per Direction OpenBenchmarking.org Seconds, Fewer Is Better Xcompact3d Incompact3d 2021-03-11 Input: input.i3d 129 Cells Per Direction 1 2 3 13 26 39 52 65 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 SE +/- 0.02, N = 3 59.77 59.86 59.83 1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
Xcompact3d Incompact3d Input: input.i3d 193 Cells Per Direction OpenBenchmarking.org Seconds, Fewer Is Better Xcompact3d Incompact3d 2021-03-11 Input: input.i3d 193 Cells Per Direction 1 2 3 40 80 120 160 200 SE +/- 0.09, N = 3 SE +/- 0.09, N = 3 SE +/- 0.13, N = 3 203.09 202.93 203.14 1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
simdjson Throughput Test: Kostya OpenBenchmarking.org GB/s, More Is Better simdjson 0.8.2 Throughput Test: Kostya 1 2 3 0.531 1.062 1.593 2.124 2.655 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 2.36 2.36 2.36 1. (CXX) g++ options: -O3 -pthread
simdjson Throughput Test: LargeRandom OpenBenchmarking.org GB/s, More Is Better simdjson 0.8.2 Throughput Test: LargeRandom 1 2 3 0.198 0.396 0.594 0.792 0.99 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 0.88 0.88 0.88 1. (CXX) g++ options: -O3 -pthread
simdjson Throughput Test: PartialTweets OpenBenchmarking.org GB/s, More Is Better simdjson 0.8.2 Throughput Test: PartialTweets 1 2 3 0.7943 1.5886 2.3829 3.1772 3.9715 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 3.52 3.53 3.52 1. (CXX) g++ options: -O3 -pthread
simdjson Throughput Test: DistinctUserID OpenBenchmarking.org GB/s, More Is Better simdjson 0.8.2 Throughput Test: DistinctUserID 1 2 3 0.8978 1.7956 2.6934 3.5912 4.489 SE +/- 0.00, N = 3 SE +/- 0.01, N = 3 SE +/- 0.00, N = 3 3.99 3.98 3.98 1. (CXX) g++ options: -O3 -pthread
AOM AV1 Encoder Mode: Speed 0 Two-Pass OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 2.1-rc Encoder Mode: Speed 0 Two-Pass 1 2 3 0.036 0.072 0.108 0.144 0.18 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 8 0.16 0.16 0.16 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
AOM AV1 Encoder Mode: Speed 4 Two-Pass OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 2.1-rc Encoder Mode: Speed 4 Two-Pass 1 2 3 0.855 1.71 2.565 3.42 4.275 SE +/- 0.00, N = 3 SE +/- 0.01, N = 3 SE +/- 0.00, N = 3 3.79 3.79 3.80 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
AOM AV1 Encoder Mode: Speed 6 Realtime OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 2.1-rc Encoder Mode: Speed 6 Realtime 1 2 3 3 6 9 12 15 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 12.31 12.33 12.35 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
AOM AV1 Encoder Mode: Speed 6 Two-Pass OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 2.1-rc Encoder Mode: Speed 6 Two-Pass 1 2 3 3 6 9 12 15 SE +/- 0.01, N = 3 SE +/- 0.00, N = 3 SE +/- 0.01, N = 3 9.87 9.86 9.84 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
AOM AV1 Encoder Mode: Speed 8 Realtime OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 2.1-rc Encoder Mode: Speed 8 Realtime 1 2 3 14 28 42 56 70 SE +/- 0.16, N = 3 SE +/- 0.14, N = 3 SE +/- 0.08, N = 3 62.44 62.54 62.45 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
SVT-HEVC Tuning: 1 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-HEVC 1.5.0 Tuning: 1 - Input: Bosphorus 1080p 1 2 3 0.693 1.386 2.079 2.772 3.465 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 3.08 3.08 3.08 1. (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt
SVT-HEVC Tuning: 7 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-HEVC 1.5.0 Tuning: 7 - Input: Bosphorus 1080p 1 2 3 11 22 33 44 55 SE +/- 0.03, N = 3 SE +/- 0.02, N = 3 SE +/- 0.08, N = 3 47.85 47.78 47.79 1. (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt
SVT-HEVC Tuning: 10 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-HEVC 1.5.0 Tuning: 10 - Input: Bosphorus 1080p 1 2 3 20 40 60 80 100 SE +/- 0.14, N = 3 SE +/- 0.10, N = 3 SE +/- 0.10, N = 3 101.85 101.85 101.98 1. (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt
SVT-VP9 Tuning: VMAF Optimized - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-VP9 0.3 Tuning: VMAF Optimized - Input: Bosphorus 1080p 1 2 3 20 40 60 80 100 SE +/- 0.10, N = 3 SE +/- 0.11, N = 3 SE +/- 0.04, N = 3 87.85 87.66 87.74 1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
SVT-VP9 Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-VP9 0.3 Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p 1 2 3 20 40 60 80 100 SE +/- 0.02, N = 3 SE +/- 0.06, N = 3 SE +/- 0.17, N = 3 88.01 88.04 88.06 1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
SVT-VP9 Tuning: Visual Quality Optimized - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-VP9 0.3 Tuning: Visual Quality Optimized - Input: Bosphorus 1080p 1 2 3 16 32 48 64 80 SE +/- 0.01, N = 3 SE +/- 0.07, N = 3 SE +/- 0.05, N = 3 69.82 70.00 69.87 1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
Timed Mesa Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed Mesa Compilation 21.0 Time To Compile 1 2 3 30 60 90 120 150 SE +/- 0.14, N = 3 SE +/- 0.05, N = 3 SE +/- 0.09, N = 3 126.26 126.19 126.39
Timed Node.js Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed Node.js Compilation 15.11 Time To Compile 1 2 3 200 400 600 800 1000 SE +/- 0.06, N = 3 SE +/- 0.31, N = 3 SE +/- 0.28, N = 3 1106.50 1106.64 1106.68
oneDNN Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU 1 2 3 2 4 6 8 10 SE +/- 0.01467, N = 3 SE +/- 0.02581, N = 3 SE +/- 0.00688, N = 3 7.98948 8.01298 7.97862 MIN: 7.81 MIN: 7.85 MIN: 7.79 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU 1 2 3 3 6 9 12 15 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 SE +/- 0.01, N = 3 12.28 12.10 12.06 MIN: 12.09 MIN: 11.78 MIN: 11.91 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU 1 2 3 0.8238 1.6476 2.4714 3.2952 4.119 SE +/- 0.00650, N = 3 SE +/- 0.00209, N = 3 SE +/- 0.00493, N = 3 3.65702 3.66132 3.65956 MIN: 3.61 MIN: 3.61 MIN: 3.62 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU 1 2 3 0.7401 1.4802 2.2203 2.9604 3.7005 SE +/- 0.00631, N = 3 SE +/- 0.01431, N = 3 SE +/- 0.00597, N = 3 3.28942 3.28091 3.20399 MIN: 3.22 MIN: 3.2 MIN: 3.13 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU 1 2 3 5 10 15 20 25 SE +/- 0.01, N = 3 SE +/- 0.04, N = 3 SE +/- 0.01, N = 3 20.90 20.93 20.90 MIN: 20.84 MIN: 20.82 MIN: 20.81 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU 1 2 3 4 8 12 16 20 SE +/- 0.04, N = 3 SE +/- 0.05, N = 3 SE +/- 0.02, N = 3 14.46 14.40 14.43 MIN: 10.63 MIN: 10.62 MIN: 10.61 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU 1 2 3 4 8 12 16 20 SE +/- 0.02, N = 3 SE +/- 0.03, N = 3 SE +/- 0.02, N = 3 14.40 14.51 14.46 MIN: 14.25 MIN: 14.31 MIN: 14.29 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU 1 2 3 5 10 15 20 25 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 SE +/- 0.02, N = 3 20.50 20.52 20.52 MIN: 20.32 MIN: 20.29 MIN: 20.32 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU 1 2 3 1.0768 2.1536 3.2304 4.3072 5.384 SE +/- 0.01143, N = 3 SE +/- 0.00463, N = 3 SE +/- 0.01370, N = 3 4.77871 4.76344 4.78576 MIN: 4.73 MIN: 4.73 MIN: 4.73 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU 1 2 3 2 4 6 8 10 SE +/- 0.01132, N = 3 SE +/- 0.01834, N = 3 SE +/- 0.01391, N = 3 8.07519 8.10192 8.08423 MIN: 8.01 MIN: 8.04 MIN: 8.03 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU 1 2 3 1600 3200 4800 6400 8000 SE +/- 4.38, N = 3 SE +/- 4.06, N = 3 SE +/- 2.94, N = 3 7395.55 7399.68 7394.30 MIN: 7383.32 MIN: 7381.45 MIN: 7382.12 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU 1 2 3 800 1600 2400 3200 4000 SE +/- 1.93, N = 3 SE +/- 7.15, N = 3 SE +/- 1.51, N = 3 3948.98 3955.17 3951.90 MIN: 3941.23 MIN: 3940.03 MIN: 3944.83 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU 1 2 3 1600 3200 4800 6400 8000 SE +/- 1.70, N = 3 SE +/- 2.46, N = 3 SE +/- 3.85, N = 3 7393.22 7401.56 7396.59 MIN: 7380.82 MIN: 7389.32 MIN: 7381.63 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU 1 2 3 800 1600 2400 3200 4000 SE +/- 2.90, N = 3 SE +/- 2.76, N = 3 SE +/- 2.54, N = 3 3952.65 3952.08 3955.00 MIN: 3942.22 MIN: 3941.26 MIN: 3946.66 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU 1 2 3 1.2213 2.4426 3.6639 4.8852 6.1065 SE +/- 0.00412, N = 3 SE +/- 0.00660, N = 3 SE +/- 0.00691, N = 3 5.42817 5.42101 5.38817 MIN: 5.36 MIN: 5.36 MIN: 5.33 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU 1 2 3 1600 3200 4800 6400 8000 SE +/- 2.52, N = 3 SE +/- 5.16, N = 3 SE +/- 15.57, N = 3 7395.13 7396.46 7408.91 MIN: 7381.08 MIN: 7381.2 MIN: 7383.82 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU 1 2 3 800 1600 2400 3200 4000 SE +/- 2.53, N = 3 SE +/- 0.47, N = 3 SE +/- 4.66, N = 3 3955.37 3954.96 3957.84 MIN: 3945.55 MIN: 3947.75 MIN: 3944.06 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU 1 2 3 1.3313 2.6626 3.9939 5.3252 6.6565 SE +/- 0.00433, N = 3 SE +/- 0.00128, N = 3 SE +/- 0.00186, N = 3 5.91706 5.90636 5.90709 MIN: 5.87 MIN: 5.87 MIN: 5.87 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
ASTC Encoder Preset: Medium OpenBenchmarking.org Seconds, Fewer Is Better ASTC Encoder 2.4 Preset: Medium 1 2 3 2 4 6 8 10 SE +/- 0.0175, N = 3 SE +/- 0.0040, N = 3 SE +/- 0.0096, N = 3 8.9180 8.9380 8.9296 1. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder Preset: Thorough OpenBenchmarking.org Seconds, Fewer Is Better ASTC Encoder 2.4 Preset: Thorough 1 2 3 8 16 24 32 40 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 33.01 32.99 33.02 1. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder Preset: Exhaustive OpenBenchmarking.org Seconds, Fewer Is Better ASTC Encoder 2.4 Preset: Exhaustive 1 2 3 60 120 180 240 300 SE +/- 0.05, N = 3 SE +/- 0.00, N = 3 SE +/- 0.05, N = 3 256.21 256.09 256.17 1. (CXX) g++ options: -O3 -flto -pthread
Basis Universal Settings: ETC1S OpenBenchmarking.org Seconds, Fewer Is Better Basis Universal 1.13 Settings: ETC1S 1 2 3 8 16 24 32 40 SE +/- 0.03, N = 3 SE +/- 0.05, N = 3 SE +/- 0.04, N = 3 36.77 36.84 36.84 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Basis Universal Settings: UASTC Level 0 OpenBenchmarking.org Seconds, Fewer Is Better Basis Universal 1.13 Settings: UASTC Level 0 1 2 3 3 6 9 12 15 SE +/- 0.02, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 11.01 11.00 10.99 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Basis Universal Settings: UASTC Level 2 OpenBenchmarking.org Seconds, Fewer Is Better Basis Universal 1.13 Settings: UASTC Level 2 1 2 3 16 32 48 64 80 SE +/- 0.02, N = 3 SE +/- 0.00, N = 3 SE +/- 0.01, N = 3 72.90 72.91 72.89 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Basis Universal Settings: UASTC Level 3 OpenBenchmarking.org Seconds, Fewer Is Better Basis Universal 1.13 Settings: UASTC Level 3 1 2 3 30 60 90 120 150 SE +/- 0.02, N = 3 SE +/- 0.00, N = 3 SE +/- 0.02, N = 3 144.17 144.08 144.08 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Mobile Neural Network Model: SqueezeNetV1.0 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 1.1.3 Model: SqueezeNetV1.0 1 2 3 2 4 6 8 10 SE +/- 0.015, N = 3 SE +/- 0.012, N = 3 SE +/- 0.011, N = 3 7.550 7.479 7.458 MIN: 7.38 / MAX: 30.78 MIN: 7.4 / MAX: 11.93 MIN: 7.38 / MAX: 29.38 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network Model: resnet-v2-50 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 1.1.3 Model: resnet-v2-50 1 2 3 10 20 30 40 50 SE +/- 0.15, N = 3 SE +/- 0.09, N = 3 SE +/- 0.14, N = 3 45.71 45.35 45.28 MIN: 44.4 / MAX: 74.55 MIN: 45.03 / MAX: 69.72 MIN: 44.97 / MAX: 67.28 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network Model: MobileNetV2_224 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 1.1.3 Model: MobileNetV2_224 1 2 3 0.9074 1.8148 2.7222 3.6296 4.537 SE +/- 0.014, N = 3 SE +/- 0.012, N = 3 SE +/- 0.018, N = 3 4.033 4.011 4.001 MIN: 3.94 / MAX: 25.54 MIN: 3.92 / MAX: 27.37 MIN: 3.91 / MAX: 27.34 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network Model: mobilenet-v1-1.0 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 1.1.3 Model: mobilenet-v1-1.0 1 2 3 1.0298 2.0596 3.0894 4.1192 5.149 SE +/- 0.013, N = 3 SE +/- 0.006, N = 3 SE +/- 0.005, N = 3 4.577 4.545 4.534 MIN: 4.48 / MAX: 26.26 MIN: 4.48 / MAX: 28.03 MIN: 4.49 / MAX: 8.91 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network Model: inception-v3 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 1.1.3 Model: inception-v3 1 2 3 12 24 36 48 60 SE +/- 0.21, N = 3 SE +/- 0.49, N = 3 SE +/- 0.56, N = 3 55.43 54.84 55.04 MIN: 54.13 / MAX: 79.1 MIN: 53.62 / MAX: 78.53 MIN: 53.69 / MAX: 79.04 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Sysbench Test: RAM / Memory OpenBenchmarking.org MiB/sec, More Is Better Sysbench 1.0.20 Test: RAM / Memory 1 2 3 4K 8K 12K 16K 20K SE +/- 100.94, N = 3 SE +/- 70.80, N = 3 SE +/- 83.93, N = 3 16690.54 16726.66 16845.49 1. (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm
Sysbench Test: CPU OpenBenchmarking.org Events Per Second, More Is Better Sysbench 1.0.20 Test: CPU 1 2 3 2K 4K 6K 8K 10K SE +/- 0.37, N = 3 SE +/- 0.62, N = 3 SE +/- 0.29, N = 3 7854.28 7845.28 7843.34 1. (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm
Stockfish Total Time OpenBenchmarking.org Nodes Per Second, More Is Better Stockfish 13 Total Time 1 2 3 2M 4M 6M 8M 10M SE +/- 62662.98, N = 3 SE +/- 5126.49, N = 3 SE +/- 48708.15, N = 3 10343841 10338531 10254227 1. (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fprofile-use -fno-peel-loops -fno-tracer -pedantic -O3 -msse -msse3 -mpopcnt -mavx2 -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto=jobserver
Phoronix Test Suite v10.8.5