9400F mar
Intel Core i5-9400F testing with an MSI B360M GAMING PLUS (MS-7B19) v1.0 (1.10 BIOS) and MSI NVIDIA NV106 1GB on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2103186-HA-9400FMAR607&grs
System details (runs 1, 2, 3, 4)
  Processor: Intel Core i5-9400F @ 4.10GHz (6 Cores)
  Motherboard: MSI B360M GAMING PLUS (MS-7B19) v1.0 (1.10 BIOS)
  Chipset: Intel Cannon Lake PCH
  Memory: 16GB
  Disk: 256GB SAMSUNG MZVPW256HEGL-000H7
  Graphics: MSI NVIDIA NV106 1GB
  Audio: Realtek ALC887-VD
  Monitor: G237HL
  Network: Intel I219-V
  OS: Ubuntu 20.04
  Kernel: 5.9.0-050900rc7daily20200928-generic (x86_64) 20200927
  Desktop: GNOME Shell 3.36.0
  Display Server: X Server 1.20.7
  Display Driver: nouveau
  OpenGL: 4.3 Mesa 20.0.2
  Compiler: GCC 9.3.0
  File-System: ext4
  Screen Resolution: 1920x1080
Kernel Details
  Transparent Huge Pages: madvise
Compiler Details
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details
  Scaling Governor: intel_pstate powersave
  CPU Microcode: 0xca
  Thermald 1.9.1
Security Details
  itlb_multihit: KVM: Mitigation of VMX disabled
  l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT disabled
  mds: Mitigation of Clear buffers; SMT disabled
  meltdown: Mitigation of PTI
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: disabled RSB filling
  srbds: Vulnerable: No microcode
  tsx_async_abort: Not affected
Python Details
  2, 3, 4: Python 3.8.2
Test | 1 | 2 | 3 | 4
onednn: IP Shapes 3D - f32 - CPU | 9.33806 | 12.5133 | 12.6719 | 13.5888
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 18.1639 | 19.0459 | 19.1537 | 19.5227
onednn: IP Shapes 3D - u8s8f32 - CPU | 2.35835 | 2.51852 | 2.51603 | 2.50818
onednn: IP Shapes 1D - f32 - CPU | 5.74885 | 6.08745 | 6.10535 | 6.08841
onednn: Convolution Batch Shapes Auto - f32 - CPU | 24.0970 | 24.9312 | 25.0245 | 25.1555
incompact3d: input.i3d 129 Cells Per Direction | 50.8227946 | 52.4945691 | 51.3005333 | 52.8742867
sysbench: RAM / Memory | 12977.84 | 12774.42 | 13109.86 | 13237.06
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 4.43829 | 4.59147 | 4.59240 | 4.58797
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 4693.82 | 4750.44 | 4748.24 | 4822.05
onednn: Recurrent Neural Network Inference - f32 - CPU | 2712.79 | 2746.26 | 2755.81 | 2785.40
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 2710.16 | 2750.23 | 2758.35 | 2776.61
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 2716.89 | 2750.92 | 2756.81 | 2778.13
sysbench: CPU | 8221.43 | 8047.96 | 8211.92 | 8218.83
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 4688.65 | 4745.82 | 4750.15 | 4784.78
onednn: Recurrent Neural Network Training - f32 - CPU | 4687.02 | 4739.81 | 4742.25 | 4775.31
onednn: Deconvolution Batch shapes_3d - f32 - CPU | 9.42941 | 9.55913 | 9.50810 | 9.39419
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 6.31785 | 6.24856 | 6.28776 | 6.21044
aom-av1: Speed 8 Realtime | 79.67 | 78.92 | 78.85 | 78.36
svt-vp9: PSNR/SSIM Optimized - Bosphorus 1080p | 115.10 | 115.03 | 114.40 | 113.34
mnn: MobileNetV2_224 | 3.347 | 3.372 | 3.396 | 3.389
svt-vp9: VMAF Optimized - Bosphorus 1080p | 114.66 | 114.39 | 114.20 | 113.07
simdjson: LargeRand | 0.87 | 0.87 | 0.88 | 0.87
build-mesa: Time To Compile | 99.744 | 98.712 | 98.769 | 99.020
incompact3d: input.i3d 192 Cells Per Direction | 455.435414 | 457.687053 | 457.613190 | 459.686717
mnn: inception-v3 | 31.715 | 31.699 | 31.833 | 31.583
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 3.11447 | 3.13783 | 3.12549 | 3.11856
aom-av1: Speed 6 Realtime | 16.36 | 16.28 | 16.28 | 16.24
mnn: SqueezeNetV1.0 | 5.302 | 5.285 | 5.307 | 5.272
svt-hevc: 10 - Bosphorus 1080p | 130.50 | 130.96 | 130.43 | 130.15
svt-vp9: Visual Quality Optimized - Bosphorus 1080p | 93.79 | 93.46 | 93.81 | 93.23
mnn: mobilenet-v1-1.0 | 3.509 | 3.507 | 3.522 | 3.502
onednn: Deconvolution Batch shapes_1d - f32 - CPU | 7.28655 | 7.25209 | 7.24862 | 7.25760
basis: UASTC Level 0 | 9.753 | 9.772 | 9.767 | 9.795
aom-av1: Speed 4 Two-Pass | 4.78 | 4.76 | 4.76 | 4.76
simdjson: Kostya | 2.54 | 2.55 | 2.54 | 2.54
mnn: resnet-v2-50 | 26.946 | 26.930 | 27.018 | 26.984
svt-hevc: 7 - Bosphorus 1080p | 63.38 | 63.31 | 63.26 | 63.20
simdjson: DistinctUserID | 3.74 | 3.74 | 3.74 | 3.73
svt-hevc: 1 - Bosphorus 1080p | 4.39 | 4.39 | 4.39 | 4.38
aom-av1: Speed 6 Two-Pass | 13.23 | 13.21 | 13.21 | 13.2
basis: ETC1S | 32.520 | 32.567 | 32.550 | 32.585
onednn: IP Shapes 1D - u8s8f32 - CPU | 3.31567 | 3.31990 | 3.31794 | 3.32207
build-nodejs: Time To Compile | 890.183 | 890.007 | - | 891.236
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 4.95866 | 4.96304 | 4.95866 | 4.96107
basis: UASTC Level 2 | 53.510 | 53.532 | 53.528 | 53.553
astcenc: Medium | 8.1425 | 8.1432 | 8.1480 | 8.1484
basis: UASTC Level 3 | 106.515 | 106.469 | 106.535 | 106.519
astcenc: Thorough | 27.2154 | 27.2210 | 27.2310 | 27.2295
astcenc: Exhaustive | 208.4311 | 208.4722 | 208.5418 | 208.4736
aom-av1: Speed 0 Two-Pass | 0.19 | 0.19 | 0.19 | 0.19
simdjson: PartialTweets | 3.60 | 3.60 | 3.6 | 3.60
oneDNN 2.1.2 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  1: 9.33806 (SE +/- 0.04042, N = 3; MIN: 9.17)
  2: 12.51330 (SE +/- 0.05232, N = 3; MIN: 12.3)
  3: 12.67190 (SE +/- 0.04540, N = 3; MIN: 12.47)
  4: 13.58880 (SE +/- 0.07677, N = 3; MIN: 13.41)
  (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
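The IP Shapes 3D f32 result shows the largest spread between runs in this comparison. As a rough illustration only, the sketch below computes how much slower each run is relative to run 1, using the mean latencies reported above. Treating run 1 as the baseline is an assumption made for this example; the result file itself does not designate a baseline.

```python
# Mean latencies in ms for oneDNN IP Shapes 3D (f32, CPU), copied from the result above.
results_ms = {1: 9.33806, 2: 12.51330, 3: 12.67190, 4: 13.58880}


def percent_change(value, baseline):
    """Return the relative change of value versus baseline, in percent."""
    return (value / baseline - 1.0) * 100.0


# Assumption for illustration: run 1 is used as the reference point.
baseline = results_ms[1]
slowdown_pct = {run: percent_change(ms, baseline) for run, ms in results_ms.items()}

for run, pct in sorted(slowdown_pct.items()):
    # Lower is better for this test, so a positive percentage means a slower run.
    print(f"run {run}: {results_ms[run]:.5f} ms ({pct:+.1f}% vs run 1)")
```

Because this test is "Fewer Is Better", positive percentages indicate a regression relative to run 1; for "More Is Better" tests the sign would have to be read the other way.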
oneDNN 2.1.2 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  1: 18.16 (SE +/- 0.04, N = 3; MIN: 17.89)
  2: 19.05 (SE +/- 0.04, N = 3; MIN: 18.69)
  3: 19.15 (SE +/- 0.04, N = 3; MIN: 18.77)
  4: 19.52 (SE +/- 0.04, N = 3; MIN: 19.1)
  (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  1: 2.35835 (SE +/- 0.01460, N = 3; MIN: 2.3)
  2: 2.51852 (SE +/- 0.01713, N = 3; MIN: 2.46)
  3: 2.51603 (SE +/- 0.01626, N = 3; MIN: 2.46)
  4: 2.50818 (SE +/- 0.01463, N = 3; MIN: 2.45)
  (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  1: 5.74885 (SE +/- 0.04456, N = 3; MIN: 5.6)
  2: 6.08745 (SE +/- 0.03402, N = 3; MIN: 5.94)
  3: 6.10535 (SE +/- 0.04121, N = 3; MIN: 5.93)
  4: 6.08841 (SE +/- 0.02504, N = 3; MIN: 5.93)
  (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  1: 24.10 (SE +/- 0.02, N = 3; MIN: 23.6)
  2: 24.93 (SE +/- 0.02, N = 3; MIN: 24.61)
  3: 25.02 (SE +/- 0.03, N = 3; MIN: 24.73)
  4: 25.16 (SE +/- 0.03, N = 3; MIN: 24.53)
  (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 129 Cells Per Direction
  Unit: Seconds, Fewer Is Better
  1: 50.82 (SE +/- 0.57, N = 3)
  2: 52.49 (SE +/- 0.05, N = 3)
  3: 51.30 (SE +/- 0.59, N = 3)
  4: 52.87 (SE +/- 0.02, N = 3)
  (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
Sysbench 1.0.20 - Test: RAM / Memory
  Unit: MiB/sec, More Is Better
  1: 12977.84 (SE +/- 61.19, N = 3)
  2: 12774.42 (SE +/- 109.75, N = 3)
  3: 13109.86 (SE +/- 133.79, N = 3)
  4: 13237.06 (SE +/- 53.34, N = 3)
  (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm
oneDNN 2.1.2 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  1: 4.43829 (SE +/- 0.01052, N = 3; MIN: 4.36)
  2: 4.59147 (SE +/- 0.00822, N = 3; MIN: 4.52)
  3: 4.59240 (SE +/- 0.00804, N = 3; MIN: 4.52)
  4: 4.58797 (SE +/- 0.01165, N = 3; MIN: 4.52)
  (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  1: 4693.82 (SE +/- 3.50, N = 3; MIN: 4566.24)
  2: 4750.44 (SE +/- 4.18, N = 3; MIN: 4617.15)
  3: 4748.24 (SE +/- 2.95, N = 3; MIN: 4623.59)
  4: 4822.05 (SE +/- 19.23, N = 3; MIN: 4660.46)
  (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  1: 2712.79 (SE +/- 1.94, N = 3; MIN: 2634.02)
  2: 2746.26 (SE +/- 5.11, N = 3; MIN: 2668.84)
  3: 2755.81 (SE +/- 0.72, N = 3; MIN: 2681.59)
  4: 2785.40 (SE +/- 11.46, N = 3; MIN: 2704.42)
  (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  1: 2710.16 (SE +/- 5.07, N = 3; MIN: 2633.06)
  2: 2750.23 (SE +/- 2.90, N = 3; MIN: 2673.4)
  3: 2758.35 (SE +/- 2.26, N = 3; MIN: 2683.59)
  4: 2776.61 (SE +/- 2.56, N = 3; MIN: 2704.96)
  (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
  Unit: ms, Fewer Is Better
  1: 2716.89 (SE +/- 1.04, N = 3; MIN: 2639.02)
  2: 2750.92 (SE +/- 2.48, N = 3; MIN: 2677.79)
  3: 2756.81 (SE +/- 2.36, N = 3; MIN: 2683.92)
  4: 2778.13 (SE +/- 3.78, N = 3; MIN: 2704.44)
  (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Sysbench 1.0.20 - Test: CPU
  Unit: Events Per Second, More Is Better
  1: 8221.43 (SE +/- 0.76, N = 3)
  2: 8047.96 (SE +/- 111.87, N = 4)
  3: 8211.92 (SE +/- 2.08, N = 3)
  4: 8218.83 (SE +/- 2.96, N = 3)
  (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm
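Run 2 of the Sysbench CPU test stands out with a much larger standard error (111.87 over N = 4) than the other runs. The sketch below is one hedged way to put that in context. It assumes that "SE" denotes the standard error of the mean (sample standard deviation divided by the square root of N), which is the conventional definition but is not stated in this export. The helper names are hypothetical and exist only for this example.

```python
import math

# (mean events per second, standard error, sample count), copied from the result above.
sysbench_cpu = {
    1: (8221.43, 0.76, 3),
    2: (8047.96, 111.87, 4),
    3: (8211.92, 2.08, 3),
    4: (8218.83, 2.96, 3),
}


def std_dev_from_se(se, n):
    """Recover the sample standard deviation, assuming SE = SD / sqrt(N)."""
    return se * math.sqrt(n)


def gap_in_combined_se(a, b):
    """Express the difference of two means in units of their combined standard error."""
    mean_a, se_a, _ = a
    mean_b, se_b, _ = b
    return abs(mean_a - mean_b) / math.hypot(se_a, se_b)


sd_run2 = std_dev_from_se(sysbench_cpu[2][1], sysbench_cpu[2][2])
gap_1_vs_2 = gap_in_combined_se(sysbench_cpu[1], sysbench_cpu[2])

print(f"run 2 implied standard deviation: {sd_run2:.2f} events/sec")
print(f"run 1 vs run 2 gap: {gap_1_vs_2:.2f} combined standard errors")
```

Under that assumption, the gap between runs 1 and 2 is on the order of one to two combined standard errors, so the lower mean of run 2 may reflect a noisy sample rather than a real difference. With only three or four samples per run, this should be read as a rough indication, not a significance test.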
oneDNN 2.1.2 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
  Unit: ms, Fewer Is Better
  1: 4688.65 (SE +/- 5.69, N = 3; MIN: 4561.77)
  2: 4745.82 (SE +/- 8.86, N = 3; MIN: 4614.86)
  3: 4750.15 (SE +/- 5.69, N = 3; MIN: 4624.59)
  4: 4784.78 (SE +/- 5.35, N = 3; MIN: 4663.88)
  (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  1: 4687.02 (SE +/- 9.86, N = 3; MIN: 4552.07)
  2: 4739.81 (SE +/- 2.77, N = 3; MIN: 4612.81)
  3: 4742.25 (SE +/- 6.95, N = 3; MIN: 4622.61)
  4: 4775.31 (SE +/- 8.03, N = 3; MIN: 4644.44)
  (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  1: 9.42941 (SE +/- 0.06124, N = 3; MIN: 9.24)
  2: 9.55913 (SE +/- 0.01638, N = 3; MIN: 9.34)
  3: 9.50810 (SE +/- 0.06496, N = 3; MIN: 9.23)
  4: 9.39419 (SE +/- 0.01310, N = 3; MIN: 9.21)
  (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  1: 6.31785 (SE +/- 0.02640, N = 3; MIN: 6.22)
  2: 6.24856 (SE +/- 0.00509, N = 3; MIN: 6.11)
  3: 6.28776 (SE +/- 0.02902, N = 3; MIN: 6.15)
  4: 6.21044 (SE +/- 0.02410, N = 3; MIN: 6.08)
  (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
AOM AV1 2.1-rc - Encoder Mode: Speed 8 Realtime
  Unit: Frames Per Second, More Is Better
  1: 79.67 (SE +/- 0.03, N = 3)
  2: 78.92 (SE +/- 0.04, N = 3)
  3: 78.85 (SE +/- 0.18, N = 3)
  4: 78.36 (SE +/- 0.07, N = 3)
  (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
SVT-VP9 0.3 - Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p
  Unit: Frames Per Second, More Is Better
  1: 115.10 (SE +/- 0.33, N = 3)
  2: 115.03 (SE +/- 0.23, N = 3)
  3: 114.40 (SE +/- 0.04, N = 3)
  4: 113.34 (SE +/- 0.26, N = 3)
  (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
Mobile Neural Network 1.1.3 - Model: MobileNetV2_224
  Unit: ms, Fewer Is Better
  1: 3.347 (SE +/- 0.032, N = 3; MIN: 3.18 / MAX: 13.7)
  2: 3.372 (SE +/- 0.024, N = 3; MIN: 3.21 / MAX: 14.08)
  3: 3.396 (SE +/- 0.018, N = 3; MIN: 3.24 / MAX: 13.3)
  4: 3.389 (SE +/- 0.029, N = 3; MIN: 3.23 / MAX: 14.6)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
SVT-VP9 0.3 - Tuning: VMAF Optimized - Input: Bosphorus 1080p
  Unit: Frames Per Second, More Is Better
  1: 114.66 (SE +/- 0.39, N = 3)
  2: 114.39 (SE +/- 0.46, N = 3)
  3: 114.20 (SE +/- 0.18, N = 3)
  4: 113.07 (SE +/- 0.07, N = 3)
  (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
simdjson 0.8.2 - Throughput Test: LargeRandom
  Unit: GB/s, More Is Better
  1: 0.87 (SE +/- 0.00, N = 3)
  2: 0.87 (SE +/- 0.00, N = 3)
  3: 0.88 (SE +/- 0.00, N = 3)
  4: 0.87 (SE +/- 0.00, N = 3)
  (CXX) g++ options: -O3 -pthread
Timed Mesa Compilation 21.0 - Time To Compile
  Unit: Seconds, Fewer Is Better
  1: 99.74 (SE +/- 1.18, N = 3)
  2: 98.71 (SE +/- 0.04, N = 3)
  3: 98.77 (SE +/- 0.11, N = 3)
  4: 99.02 (SE +/- 0.08, N = 3)
Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 192 Cells Per Direction
  Unit: Seconds, Fewer Is Better
  1: 455.44 (SE +/- 0.04, N = 3)
  2: 457.69 (SE +/- 1.04, N = 3)
  3: 457.61 (SE +/- 0.78, N = 3)
  4: 459.69 (SE +/- 1.01, N = 3)
  (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
Mobile Neural Network 1.1.3 - Model: inception-v3
  Unit: ms, Fewer Is Better
  1: 31.72 (SE +/- 0.06, N = 3; MIN: 31.44 / MAX: 40.88)
  2: 31.70 (SE +/- 0.10, N = 3; MIN: 31.46 / MAX: 41.89)
  3: 31.83 (SE +/- 0.11, N = 3; MIN: 31.51 / MAX: 42.73)
  4: 31.58 (SE +/- 0.05, N = 3; MIN: 31.39 / MAX: 41.38)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  1: 3.11447 (SE +/- 0.00994, N = 3; MIN: 3.08)
  2: 3.13783 (SE +/- 0.01469, N = 3; MIN: 3.09)
  3: 3.12549 (SE +/- 0.00251, N = 3; MIN: 3.09)
  4: 3.11856 (SE +/- 0.00550, N = 3; MIN: 3.09)
  (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
AOM AV1 2.1-rc - Encoder Mode: Speed 6 Realtime
  Unit: Frames Per Second, More Is Better
  1: 16.36 (SE +/- 0.01, N = 3)
  2: 16.28 (SE +/- 0.02, N = 3)
  3: 16.28 (SE +/- 0.02, N = 3)
  4: 16.24 (SE +/- 0.03, N = 3)
  (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Mobile Neural Network 1.1.3 - Model: SqueezeNetV1.0
  Unit: ms, Fewer Is Better
  1: 5.302 (SE +/- 0.068, N = 3; MIN: 5.09 / MAX: 15.69)
  2: 5.285 (SE +/- 0.041, N = 3; MIN: 5.07 / MAX: 16.99)
  3: 5.307 (SE +/- 0.031, N = 3; MIN: 5.1 / MAX: 17.06)
  4: 5.272 (SE +/- 0.031, N = 3; MIN: 5.07 / MAX: 15.44)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
SVT-HEVC 1.5.0 - Tuning: 10 - Input: Bosphorus 1080p
  Unit: Frames Per Second, More Is Better
  1: 130.50 (SE +/- 0.37, N = 3)
  2: 130.96 (SE +/- 0.44, N = 3)
  3: 130.43 (SE +/- 0.42, N = 3)
  4: 130.15 (SE +/- 0.41, N = 3)
  (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt
SVT-VP9 0.3 - Tuning: Visual Quality Optimized - Input: Bosphorus 1080p
  Unit: Frames Per Second, More Is Better
  1: 93.79 (SE +/- 0.12, N = 3)
  2: 93.46 (SE +/- 0.09, N = 3)
  3: 93.81 (SE +/- 0.19, N = 3)
  4: 93.23 (SE +/- 0.12, N = 3)
  (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
Mobile Neural Network 1.1.3 - Model: mobilenet-v1-1.0
  Unit: ms, Fewer Is Better
  1: 3.509 (SE +/- 0.003, N = 3; MIN: 3.45 / MAX: 13.33)
  2: 3.507 (SE +/- 0.011, N = 3; MIN: 3.45 / MAX: 5.69)
  3: 3.522 (SE +/- 0.002, N = 3; MIN: 3.46 / MAX: 10.92)
  4: 3.502 (SE +/- 0.008, N = 3; MIN: 3.44 / MAX: 4.79)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  1: 7.28655 (SE +/- 0.02301, N = 3; MIN: 7.16)
  2: 7.25209 (SE +/- 0.01513, N = 3; MIN: 7.16)
  3: 7.24862 (SE +/- 0.01801, N = 3; MIN: 7.16)
  4: 7.25760 (SE +/- 0.01383, N = 3; MIN: 7.17)
  (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Basis Universal 1.13 - Settings: UASTC Level 0
  Unit: Seconds, Fewer Is Better
  1: 9.753 (SE +/- 0.009, N = 3)
  2: 9.772 (SE +/- 0.008, N = 3)
  3: 9.767 (SE +/- 0.004, N = 3)
  4: 9.795 (SE +/- 0.006, N = 3)
  (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
AOM AV1 2.1-rc - Encoder Mode: Speed 4 Two-Pass
  Unit: Frames Per Second, More Is Better
  1: 4.78 (SE +/- 0.00, N = 3)
  2: 4.76 (SE +/- 0.01, N = 3)
  3: 4.76 (SE +/- 0.00, N = 3)
  4: 4.76 (SE +/- 0.01, N = 3)
  (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
simdjson 0.8.2 - Throughput Test: Kostya
  Unit: GB/s, More Is Better
  1: 2.54 (SE +/- 0.00, N = 3)
  2: 2.55 (SE +/- 0.00, N = 3)
  3: 2.54 (SE +/- 0.00, N = 3)
  4: 2.54 (SE +/- 0.00, N = 3)
  (CXX) g++ options: -O3 -pthread
Mobile Neural Network 1.1.3 - Model: resnet-v2-50
  Unit: ms, Fewer Is Better
  1: 26.95 (SE +/- 0.03, N = 3; MIN: 26.78 / MAX: 42.17)
  2: 26.93 (SE +/- 0.05, N = 3; MIN: 26.76 / MAX: 36.61)
  3: 27.02 (SE +/- 0.03, N = 3; MIN: 26.85 / MAX: 36.71)
  4: 26.98 (SE +/- 0.07, N = 3; MIN: 26.7 / MAX: 37.05)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
SVT-HEVC 1.5.0 - Tuning: 7 - Input: Bosphorus 1080p
  Unit: Frames Per Second, More Is Better
  1: 63.38 (SE +/- 0.10, N = 3)
  2: 63.31 (SE +/- 0.13, N = 3)
  3: 63.26 (SE +/- 0.10, N = 3)
  4: 63.20 (SE +/- 0.10, N = 3)
  (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt
simdjson 0.8.2 - Throughput Test: DistinctUserID
  Unit: GB/s, More Is Better
  1: 3.74 (SE +/- 0.00, N = 3)
  2: 3.74 (SE +/- 0.00, N = 3)
  3: 3.74 (SE +/- 0.00, N = 3)
  4: 3.73 (SE +/- 0.00, N = 3)
  (CXX) g++ options: -O3 -pthread
SVT-HEVC 1.5.0 - Tuning: 1 - Input: Bosphorus 1080p
  Unit: Frames Per Second, More Is Better
  1: 4.39 (SE +/- 0.01, N = 3)
  2: 4.39 (SE +/- 0.00, N = 3)
  3: 4.39 (SE +/- 0.01, N = 3)
  4: 4.38 (SE +/- 0.01, N = 3)
  (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt
AOM AV1 2.1-rc - Encoder Mode: Speed 6 Two-Pass
  Unit: Frames Per Second, More Is Better
  1: 13.23 (SE +/- 0.01, N = 3)
  2: 13.21 (SE +/- 0.02, N = 3)
  3: 13.21 (SE +/- 0.00, N = 3)
  4: 13.20 (SE +/- 0.00, N = 3)
  (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Basis Universal 1.13 - Settings: ETC1S
  Unit: Seconds, Fewer Is Better
  1: 32.52 (SE +/- 0.02, N = 3)
  2: 32.57 (SE +/- 0.01, N = 3)
  3: 32.55 (SE +/- 0.01, N = 3)
  4: 32.59 (SE +/- 0.01, N = 3)
  (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
oneDNN 2.1.2 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  1: 3.31567 (SE +/- 0.01477, N = 3; MIN: 3.25)
  2: 3.31990 (SE +/- 0.01778, N = 3; MIN: 3.25)
  3: 3.31794 (SE +/- 0.01808, N = 3; MIN: 3.25)
  4: 3.32207 (SE +/- 0.01663, N = 3; MIN: 3.25)
  (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Timed Node.js Compilation 15.11 - Time To Compile
  Unit: Seconds, Fewer Is Better
  1: 890.18 (SE +/- 0.38, N = 3)
  2: 890.01 (SE +/- 0.12, N = 3)
  4: 891.24 (SE +/- 0.18, N = 3)
oneDNN 2.1.2 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  1: 4.95866 (SE +/- 0.01510, N = 3; MIN: 4.89)
  2: 4.96304 (SE +/- 0.00833, N = 3; MIN: 4.89)
  3: 4.95866 (SE +/- 0.01296, N = 3; MIN: 4.89)
  4: 4.96107 (SE +/- 0.01510, N = 3; MIN: 4.89)
  (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Basis Universal 1.13 - Settings: UASTC Level 2
  Unit: Seconds, Fewer Is Better
  1: 53.51 (SE +/- 0.00, N = 3)
  2: 53.53 (SE +/- 0.02, N = 3)
  3: 53.53 (SE +/- 0.01, N = 3)
  4: 53.55 (SE +/- 0.01, N = 3)
  (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
ASTC Encoder 2.4 - Preset: Medium
  Unit: Seconds, Fewer Is Better
  1: 8.1425 (SE +/- 0.0093, N = 3)
  2: 8.1432 (SE +/- 0.0128, N = 3)
  3: 8.1480 (SE +/- 0.0136, N = 3)
  4: 8.1484 (SE +/- 0.0037, N = 3)
  (CXX) g++ options: -O3 -flto -pthread
Basis Universal 1.13 - Settings: UASTC Level 3
  Unit: Seconds, Fewer Is Better
  1: 106.52 (SE +/- 0.02, N = 3)
  2: 106.47 (SE +/- 0.02, N = 3)
  3: 106.54 (SE +/- 0.02, N = 3)
  4: 106.52 (SE +/- 0.01, N = 3)
  (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
ASTC Encoder 2.4 - Preset: Thorough
  Unit: Seconds, Fewer Is Better
  1: 27.22 (SE +/- 0.01, N = 3)
  2: 27.22 (SE +/- 0.00, N = 3)
  3: 27.23 (SE +/- 0.00, N = 3)
  4: 27.23 (SE +/- 0.00, N = 3)
  (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder 2.4 - Preset: Exhaustive
  Unit: Seconds, Fewer Is Better
  1: 208.43 (SE +/- 0.03, N = 3)
  2: 208.47 (SE +/- 0.04, N = 3)
  3: 208.54 (SE +/- 0.11, N = 3)
  4: 208.47 (SE +/- 0.11, N = 3)
  (CXX) g++ options: -O3 -flto -pthread
AOM AV1 2.1-rc - Encoder Mode: Speed 0 Two-Pass
  Unit: Frames Per Second, More Is Better
  1: 0.19 (SE +/- 0.00, N = 3)
  2: 0.19 (SE +/- 0.00, N = 3)
  3: 0.19 (SE +/- 0.00, N = 3)
  4: 0.19 (SE +/- 0.00, N = 3)
  (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
simdjson 0.8.2 - Throughput Test: PartialTweets
  Unit: GB/s, More Is Better
  1: 3.60 (SE +/- 0.00, N = 3)
  2: 3.60 (SE +/- 0.01, N = 3)
  3: 3.60 (SE +/- 0.00, N = 3)
  4: 3.60 (SE +/- 0.00, N = 3)
  (CXX) g++ options: -O3 -pthread
Phoronix Test Suite v10.8.4