Xeon E3 1280 v5 m Intel Xeon E3-1280 v5 testing with a MSI Z170A SLI PLUS (MS-7998) v1.0 (2.A0 BIOS) and ASUS AMD Radeon HD 7850 / R7 265 R9 270 1024SP on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2103199-IB-XEONE312846&grr .
Xeon E3 1280 v5 m Processor Motherboard Chipset Memory Disk Graphics Audio Monitor Network OS Kernel Desktop Display Server OpenGL Compiler File-System Screen Resolution 1 2 3 Intel Xeon E3-1280 v5 @ 4.00GHz (4 Cores / 8 Threads) MSI Z170A SLI PLUS (MS-7998) v1.0 (2.A0 BIOS) Intel Xeon E3-1200 v5/E3-1500 32GB 256GB TOSHIBA RD400 ASUS AMD Radeon HD 7850 / R7 265 R9 270 1024SP Realtek ALC1150 VA2431 Intel I219-V Ubuntu 20.04 5.9.0-050900rc2daily20200826-generic (x86_64) 20200825 GNOME Shell 3.36.4 X Server 1.20.9 4.5 Mesa 20.0.8 (LLVM 10.0.0) GCC 9.3.0 ext4 1920x1080 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: intel_pstate powersave - CPU Microcode: 0xe2 - Thermald 1.9.1 Python Details - Python 3.8.5 Security Details - itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable + mds: Mitigation of Clear buffers; SMT vulnerable + meltdown: Mitigation of PTI + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Mitigation of Microcode + tsx_async_abort: Mitigation of Clear buffers; SMT vulnerable
Xeon E3 1280 v5 m build-nodejs: Time To Compile astcenc: Exhaustive incompact3d: input.i3d 193 Cells Per Direction aom-av1: Speed 0 Two-Pass svt-hevc: 1 - Bosphorus 1080p aom-av1: Speed 4 Two-Pass basis: UASTC Level 3 mnn: inception-v3 mnn: mobilenet-v1-1.0 mnn: MobileNetV2_224 mnn: resnet-v2-50 mnn: SqueezeNetV1.0 build-mesa: Time To Compile onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Training - f32 - CPU onednn: Recurrent Neural Network Training - u8s8f32 - CPU sysbench: CPU onednn: Recurrent Neural Network Inference - f32 - CPU onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Inference - u8s8f32 - CPU aom-av1: Speed 6 Two-Pass simdjson: PartialTweets basis: UASTC Level 2 simdjson: DistinctUserID simdjson: Kostya stockfish: Total Time incompact3d: input.i3d 129 Cells Per Direction simdjson: LargeRand aom-av1: Speed 6 Realtime astcenc: Thorough basis: ETC1S onednn: Deconvolution Batch shapes_1d - f32 - CPU onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU onednn: IP Shapes 1D - f32 - CPU onednn: IP Shapes 1D - u8s8f32 - CPU svt-hevc: 7 - Bosphorus 1080p onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU basis: UASTC Level 0 aom-av1: Speed 8 Realtime onednn: IP Shapes 3D - f32 - CPU onednn: IP Shapes 3D - u8s8f32 - CPU astcenc: Medium svt-vp9: Visual Quality Optimized - Bosphorus 1080p svt-vp9: VMAF Optimized - Bosphorus 1080p svt-vp9: PSNR/SSIM Optimized - Bosphorus 1080p onednn: Convolution Batch Shapes Auto - f32 - CPU svt-hevc: 10 - Bosphorus 1080p onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU sysbench: RAM / Memory onednn: Deconvolution Batch shapes_3d - f32 - CPU onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU 1 2 3 1106.502 256.2135 203.088511 0.16 3.08 3.79 144.169 55.425 4.577 4.033 45.714 7.550 126.263 7395.13 7395.55 7393.22 7854.28 3948.98 3955.37 3952.65 9.87 3.52 72.898 3.99 2.36 10343841 59.7731183 0.88 12.31 33.0108 36.774 14.4552 4.77871 7.98948 3.65702 47.85 5.42817 5.91706 11.006 62.44 12.2753 3.28942 8.9180 69.82 87.85 88.01 20.9020 101.85 20.4976 16690.54 14.4010 8.07519 1106.636 256.0918 202.932215 0.16 3.08 3.79 144.079 54.835 4.545 4.011 45.354 7.479 126.186 7396.46 7399.68 7401.56 7845.28 3955.17 3954.96 3952.08 9.86 3.53 72.911 3.98 2.36 10338531 59.8612671 0.88 12.33 32.9899 36.835 14.4014 4.76344 8.01298 3.66132 47.78 5.42101 5.90636 10.997 62.54 12.1013 3.28091 8.9380 70.00 87.66 88.04 20.9333 101.85 20.5233 16726.66 14.5108 8.10192 1106.682 256.1716 203.135991 0.16 3.08 3.80 144.079 55.042 4.534 4.001 45.280 7.458 126.386 7408.91 7394.30 7396.59 7843.34 3951.90 3957.84 3955.00 9.84 3.52 72.894 3.98 2.36 10254227 59.8264809 0.88 12.35 33.0202 36.839 14.4296 4.78576 7.97862 3.65956 47.79 5.38817 5.90709 10.992 62.45 12.0587 3.20399 8.9296 69.87 87.74 88.06 20.9031 101.98 20.5150 16845.49 14.4641 8.08423 OpenBenchmarking.org
Timed Node.js Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed Node.js Compilation 15.11 Time To Compile 1 2 3 200 400 600 800 1000 SE +/- 0.06, N = 3 SE +/- 0.31, N = 3 SE +/- 0.28, N = 3 1106.50 1106.64 1106.68
ASTC Encoder Preset: Exhaustive OpenBenchmarking.org Seconds, Fewer Is Better ASTC Encoder 2.4 Preset: Exhaustive 1 2 3 60 120 180 240 300 SE +/- 0.05, N = 3 SE +/- 0.00, N = 3 SE +/- 0.05, N = 3 256.21 256.09 256.17 1. (CXX) g++ options: -O3 -flto -pthread
Xcompact3d Incompact3d Input: input.i3d 193 Cells Per Direction OpenBenchmarking.org Seconds, Fewer Is Better Xcompact3d Incompact3d 2021-03-11 Input: input.i3d 193 Cells Per Direction 1 2 3 40 80 120 160 200 SE +/- 0.09, N = 3 SE +/- 0.09, N = 3 SE +/- 0.13, N = 3 203.09 202.93 203.14 1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
AOM AV1 Encoder Mode: Speed 0 Two-Pass OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 2.1-rc Encoder Mode: Speed 0 Two-Pass 1 2 3 0.036 0.072 0.108 0.144 0.18 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 8 0.16 0.16 0.16 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
SVT-HEVC Tuning: 1 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-HEVC 1.5.0 Tuning: 1 - Input: Bosphorus 1080p 1 2 3 0.693 1.386 2.079 2.772 3.465 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 3.08 3.08 3.08 1. (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt
AOM AV1 Encoder Mode: Speed 4 Two-Pass OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 2.1-rc Encoder Mode: Speed 4 Two-Pass 1 2 3 0.855 1.71 2.565 3.42 4.275 SE +/- 0.00, N = 3 SE +/- 0.01, N = 3 SE +/- 0.00, N = 3 3.79 3.79 3.80 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Basis Universal Settings: UASTC Level 3 OpenBenchmarking.org Seconds, Fewer Is Better Basis Universal 1.13 Settings: UASTC Level 3 1 2 3 30 60 90 120 150 SE +/- 0.02, N = 3 SE +/- 0.00, N = 3 SE +/- 0.02, N = 3 144.17 144.08 144.08 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Mobile Neural Network Model: inception-v3 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 1.1.3 Model: inception-v3 1 2 3 12 24 36 48 60 SE +/- 0.21, N = 3 SE +/- 0.49, N = 3 SE +/- 0.56, N = 3 55.43 54.84 55.04 MIN: 54.13 / MAX: 79.1 MIN: 53.62 / MAX: 78.53 MIN: 53.69 / MAX: 79.04 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network Model: mobilenet-v1-1.0 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 1.1.3 Model: mobilenet-v1-1.0 1 2 3 1.0298 2.0596 3.0894 4.1192 5.149 SE +/- 0.013, N = 3 SE +/- 0.006, N = 3 SE +/- 0.005, N = 3 4.577 4.545 4.534 MIN: 4.48 / MAX: 26.26 MIN: 4.48 / MAX: 28.03 MIN: 4.49 / MAX: 8.91 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network Model: MobileNetV2_224 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 1.1.3 Model: MobileNetV2_224 1 2 3 0.9074 1.8148 2.7222 3.6296 4.537 SE +/- 0.014, N = 3 SE +/- 0.012, N = 3 SE +/- 0.018, N = 3 4.033 4.011 4.001 MIN: 3.94 / MAX: 25.54 MIN: 3.92 / MAX: 27.37 MIN: 3.91 / MAX: 27.34 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network Model: resnet-v2-50 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 1.1.3 Model: resnet-v2-50 1 2 3 10 20 30 40 50 SE +/- 0.15, N = 3 SE +/- 0.09, N = 3 SE +/- 0.14, N = 3 45.71 45.35 45.28 MIN: 44.4 / MAX: 74.55 MIN: 45.03 / MAX: 69.72 MIN: 44.97 / MAX: 67.28 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network Model: SqueezeNetV1.0 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 1.1.3 Model: SqueezeNetV1.0 1 2 3 2 4 6 8 10 SE +/- 0.015, N = 3 SE +/- 0.012, N = 3 SE +/- 0.011, N = 3 7.550 7.479 7.458 MIN: 7.38 / MAX: 30.78 MIN: 7.4 / MAX: 11.93 MIN: 7.38 / MAX: 29.38 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Timed Mesa Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed Mesa Compilation 21.0 Time To Compile 1 2 3 30 60 90 120 150 SE +/- 0.14, N = 3 SE +/- 0.05, N = 3 SE +/- 0.09, N = 3 126.26 126.19 126.39
oneDNN Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU 1 2 3 1600 3200 4800 6400 8000 SE +/- 2.52, N = 3 SE +/- 5.16, N = 3 SE +/- 15.57, N = 3 7395.13 7396.46 7408.91 MIN: 7381.08 MIN: 7381.2 MIN: 7383.82 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU 1 2 3 1600 3200 4800 6400 8000 SE +/- 4.38, N = 3 SE +/- 4.06, N = 3 SE +/- 2.94, N = 3 7395.55 7399.68 7394.30 MIN: 7383.32 MIN: 7381.45 MIN: 7382.12 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU 1 2 3 1600 3200 4800 6400 8000 SE +/- 1.70, N = 3 SE +/- 2.46, N = 3 SE +/- 3.85, N = 3 7393.22 7401.56 7396.59 MIN: 7380.82 MIN: 7389.32 MIN: 7381.63 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Sysbench Test: CPU OpenBenchmarking.org Events Per Second, More Is Better Sysbench 1.0.20 Test: CPU 1 2 3 2K 4K 6K 8K 10K SE +/- 0.37, N = 3 SE +/- 0.62, N = 3 SE +/- 0.29, N = 3 7854.28 7845.28 7843.34 1. (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm
oneDNN Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU 1 2 3 800 1600 2400 3200 4000 SE +/- 1.93, N = 3 SE +/- 7.15, N = 3 SE +/- 1.51, N = 3 3948.98 3955.17 3951.90 MIN: 3941.23 MIN: 3940.03 MIN: 3944.83 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU 1 2 3 800 1600 2400 3200 4000 SE +/- 2.53, N = 3 SE +/- 0.47, N = 3 SE +/- 4.66, N = 3 3955.37 3954.96 3957.84 MIN: 3945.55 MIN: 3947.75 MIN: 3944.06 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU 1 2 3 800 1600 2400 3200 4000 SE +/- 2.90, N = 3 SE +/- 2.76, N = 3 SE +/- 2.54, N = 3 3952.65 3952.08 3955.00 MIN: 3942.22 MIN: 3941.26 MIN: 3946.66 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
AOM AV1 Encoder Mode: Speed 6 Two-Pass OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 2.1-rc Encoder Mode: Speed 6 Two-Pass 1 2 3 3 6 9 12 15 SE +/- 0.01, N = 3 SE +/- 0.00, N = 3 SE +/- 0.01, N = 3 9.87 9.86 9.84 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
simdjson Throughput Test: PartialTweets OpenBenchmarking.org GB/s, More Is Better simdjson 0.8.2 Throughput Test: PartialTweets 1 2 3 0.7943 1.5886 2.3829 3.1772 3.9715 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 3.52 3.53 3.52 1. (CXX) g++ options: -O3 -pthread
Basis Universal Settings: UASTC Level 2 OpenBenchmarking.org Seconds, Fewer Is Better Basis Universal 1.13 Settings: UASTC Level 2 1 2 3 16 32 48 64 80 SE +/- 0.02, N = 3 SE +/- 0.00, N = 3 SE +/- 0.01, N = 3 72.90 72.91 72.89 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
simdjson Throughput Test: DistinctUserID OpenBenchmarking.org GB/s, More Is Better simdjson 0.8.2 Throughput Test: DistinctUserID 1 2 3 0.8978 1.7956 2.6934 3.5912 4.489 SE +/- 0.00, N = 3 SE +/- 0.01, N = 3 SE +/- 0.00, N = 3 3.99 3.98 3.98 1. (CXX) g++ options: -O3 -pthread
simdjson Throughput Test: Kostya OpenBenchmarking.org GB/s, More Is Better simdjson 0.8.2 Throughput Test: Kostya 1 2 3 0.531 1.062 1.593 2.124 2.655 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 2.36 2.36 2.36 1. (CXX) g++ options: -O3 -pthread
Stockfish Total Time OpenBenchmarking.org Nodes Per Second, More Is Better Stockfish 13 Total Time 1 2 3 2M 4M 6M 8M 10M SE +/- 62662.98, N = 3 SE +/- 5126.49, N = 3 SE +/- 48708.15, N = 3 10343841 10338531 10254227 1. (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fprofile-use -fno-peel-loops -fno-tracer -pedantic -O3 -msse -msse3 -mpopcnt -mavx2 -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto=jobserver
Xcompact3d Incompact3d Input: input.i3d 129 Cells Per Direction OpenBenchmarking.org Seconds, Fewer Is Better Xcompact3d Incompact3d 2021-03-11 Input: input.i3d 129 Cells Per Direction 1 2 3 13 26 39 52 65 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 SE +/- 0.02, N = 3 59.77 59.86 59.83 1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
simdjson Throughput Test: LargeRandom OpenBenchmarking.org GB/s, More Is Better simdjson 0.8.2 Throughput Test: LargeRandom 1 2 3 0.198 0.396 0.594 0.792 0.99 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 0.88 0.88 0.88 1. (CXX) g++ options: -O3 -pthread
AOM AV1 Encoder Mode: Speed 6 Realtime OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 2.1-rc Encoder Mode: Speed 6 Realtime 1 2 3 3 6 9 12 15 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 12.31 12.33 12.35 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
ASTC Encoder Preset: Thorough OpenBenchmarking.org Seconds, Fewer Is Better ASTC Encoder 2.4 Preset: Thorough 1 2 3 8 16 24 32 40 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 33.01 32.99 33.02 1. (CXX) g++ options: -O3 -flto -pthread
Basis Universal Settings: ETC1S OpenBenchmarking.org Seconds, Fewer Is Better Basis Universal 1.13 Settings: ETC1S 1 2 3 8 16 24 32 40 SE +/- 0.03, N = 3 SE +/- 0.05, N = 3 SE +/- 0.04, N = 3 36.77 36.84 36.84 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
oneDNN Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU 1 2 3 4 8 12 16 20 SE +/- 0.04, N = 3 SE +/- 0.05, N = 3 SE +/- 0.02, N = 3 14.46 14.40 14.43 MIN: 10.63 MIN: 10.62 MIN: 10.61 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU 1 2 3 1.0768 2.1536 3.2304 4.3072 5.384 SE +/- 0.01143, N = 3 SE +/- 0.00463, N = 3 SE +/- 0.01370, N = 3 4.77871 4.76344 4.78576 MIN: 4.73 MIN: 4.73 MIN: 4.73 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU 1 2 3 2 4 6 8 10 SE +/- 0.01467, N = 3 SE +/- 0.02581, N = 3 SE +/- 0.00688, N = 3 7.98948 8.01298 7.97862 MIN: 7.81 MIN: 7.85 MIN: 7.79 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU 1 2 3 0.8238 1.6476 2.4714 3.2952 4.119 SE +/- 0.00650, N = 3 SE +/- 0.00209, N = 3 SE +/- 0.00493, N = 3 3.65702 3.66132 3.65956 MIN: 3.61 MIN: 3.61 MIN: 3.62 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
SVT-HEVC Tuning: 7 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-HEVC 1.5.0 Tuning: 7 - Input: Bosphorus 1080p 1 2 3 11 22 33 44 55 SE +/- 0.03, N = 3 SE +/- 0.02, N = 3 SE +/- 0.08, N = 3 47.85 47.78 47.79 1. (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt
oneDNN Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU 1 2 3 1.2213 2.4426 3.6639 4.8852 6.1065 SE +/- 0.00412, N = 3 SE +/- 0.00660, N = 3 SE +/- 0.00691, N = 3 5.42817 5.42101 5.38817 MIN: 5.36 MIN: 5.36 MIN: 5.33 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU 1 2 3 1.3313 2.6626 3.9939 5.3252 6.6565 SE +/- 0.00433, N = 3 SE +/- 0.00128, N = 3 SE +/- 0.00186, N = 3 5.91706 5.90636 5.90709 MIN: 5.87 MIN: 5.87 MIN: 5.87 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Basis Universal Settings: UASTC Level 0 OpenBenchmarking.org Seconds, Fewer Is Better Basis Universal 1.13 Settings: UASTC Level 0 1 2 3 3 6 9 12 15 SE +/- 0.02, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 11.01 11.00 10.99 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
AOM AV1 Encoder Mode: Speed 8 Realtime OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 2.1-rc Encoder Mode: Speed 8 Realtime 1 2 3 14 28 42 56 70 SE +/- 0.16, N = 3 SE +/- 0.14, N = 3 SE +/- 0.08, N = 3 62.44 62.54 62.45 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
oneDNN Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU 1 2 3 3 6 9 12 15 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 SE +/- 0.01, N = 3 12.28 12.10 12.06 MIN: 12.09 MIN: 11.78 MIN: 11.91 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU 1 2 3 0.7401 1.4802 2.2203 2.9604 3.7005 SE +/- 0.00631, N = 3 SE +/- 0.01431, N = 3 SE +/- 0.00597, N = 3 3.28942 3.28091 3.20399 MIN: 3.22 MIN: 3.2 MIN: 3.13 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
ASTC Encoder Preset: Medium OpenBenchmarking.org Seconds, Fewer Is Better ASTC Encoder 2.4 Preset: Medium 1 2 3 2 4 6 8 10 SE +/- 0.0175, N = 3 SE +/- 0.0040, N = 3 SE +/- 0.0096, N = 3 8.9180 8.9380 8.9296 1. (CXX) g++ options: -O3 -flto -pthread
SVT-VP9 Tuning: Visual Quality Optimized - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-VP9 0.3 Tuning: Visual Quality Optimized - Input: Bosphorus 1080p 1 2 3 16 32 48 64 80 SE +/- 0.01, N = 3 SE +/- 0.07, N = 3 SE +/- 0.05, N = 3 69.82 70.00 69.87 1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
SVT-VP9 Tuning: VMAF Optimized - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-VP9 0.3 Tuning: VMAF Optimized - Input: Bosphorus 1080p 1 2 3 20 40 60 80 100 SE +/- 0.10, N = 3 SE +/- 0.11, N = 3 SE +/- 0.04, N = 3 87.85 87.66 87.74 1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
SVT-VP9 Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-VP9 0.3 Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p 1 2 3 20 40 60 80 100 SE +/- 0.02, N = 3 SE +/- 0.06, N = 3 SE +/- 0.17, N = 3 88.01 88.04 88.06 1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
oneDNN Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU 1 2 3 5 10 15 20 25 SE +/- 0.01, N = 3 SE +/- 0.04, N = 3 SE +/- 0.01, N = 3 20.90 20.93 20.90 MIN: 20.84 MIN: 20.82 MIN: 20.81 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
SVT-HEVC Tuning: 10 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-HEVC 1.5.0 Tuning: 10 - Input: Bosphorus 1080p 1 2 3 20 40 60 80 100 SE +/- 0.14, N = 3 SE +/- 0.10, N = 3 SE +/- 0.10, N = 3 101.85 101.85 101.98 1. (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt
oneDNN Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU 1 2 3 5 10 15 20 25 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 SE +/- 0.02, N = 3 20.50 20.52 20.52 MIN: 20.32 MIN: 20.29 MIN: 20.32 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Sysbench Test: RAM / Memory OpenBenchmarking.org MiB/sec, More Is Better Sysbench 1.0.20 Test: RAM / Memory 1 2 3 4K 8K 12K 16K 20K SE +/- 100.94, N = 3 SE +/- 70.80, N = 3 SE +/- 83.93, N = 3 16690.54 16726.66 16845.49 1. (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm
oneDNN Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU 1 2 3 4 8 12 16 20 SE +/- 0.02, N = 3 SE +/- 0.03, N = 3 SE +/- 0.02, N = 3 14.40 14.51 14.46 MIN: 14.25 MIN: 14.31 MIN: 14.29 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU 1 2 3 2 4 6 8 10 SE +/- 0.01132, N = 3 SE +/- 0.01834, N = 3 SE +/- 0.01391, N = 3 8.07519 8.10192 8.08423 MIN: 8.01 MIN: 8.04 MIN: 8.03 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Phoronix Test Suite v10.8.5