9400F mar
Intel Core i5-9400F testing with an MSI B360M GAMING PLUS (MS-7B19) v1.0 (1.10 BIOS) and MSI NVIDIA NV106 1GB on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2103186-HA-9400FMAR607&rdt&grw
9400F mar
Results: 1, 2, 3, 4

Processor: Intel Core i5-9400F @ 4.10GHz (6 Cores)
Motherboard: MSI B360M GAMING PLUS (MS-7B19) v1.0 (1.10 BIOS)
Chipset: Intel Cannon Lake PCH
Memory: 16GB
Disk: 256GB SAMSUNG MZVPW256HEGL-000H7
Graphics: MSI NVIDIA NV106 1GB
Audio: Realtek ALC887-VD
Monitor: G237HL
Network: Intel I219-V
OS: Ubuntu 20.04
Kernel: 5.9.0-050900rc7daily20200928-generic (x86_64) 20200927
Desktop: GNOME Shell 3.36.0
Display Server: X Server 1.20.7
Display Driver: nouveau
OpenGL: 4.3 Mesa 20.0.2
Compiler: GCC 9.3.0
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details
  Transparent Huge Pages: madvise

Compiler Details
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
  Scaling Governor: intel_pstate powersave
  CPU Microcode: 0xca
  Thermald 1.9.1

Security Details
  itlb_multihit: KVM: Mitigation of VMX disabled
  l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT disabled
  mds: Mitigation of Clear buffers; SMT disabled
  meltdown: Mitigation of PTI
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: disabled RSB filling
  srbds: Vulnerable: No microcode
  tsx_async_abort: Not affected

Python Details
  2, 3, 4: Python 3.8.2
9400F mar

Test | 1 | 2 | 3 | 4
basis: ETC1S | 32.520 | 32.567 | 32.550 | 32.585
basis: UASTC Level 0 | 9.753 | 9.772 | 9.767 | 9.795
basis: UASTC Level 2 | 53.510 | 53.532 | 53.528 | 53.553
basis: UASTC Level 3 | 106.515 | 106.469 | 106.535 | 106.519
astcenc: Medium | 8.1425 | 8.1432 | 8.1480 | 8.1484
astcenc: Thorough | 27.2154 | 27.2210 | 27.2310 | 27.2295
astcenc: Exhaustive | 208.4311 | 208.4722 | 208.5418 | 208.4736
mnn: SqueezeNetV1.0 | 5.302 | 5.285 | 5.307 | 5.272
mnn: resnet-v2-50 | 26.946 | 26.930 | 27.018 | 26.984
mnn: MobileNetV2_224 | 3.347 | 3.372 | 3.396 | 3.389
mnn: mobilenet-v1-1.0 | 3.509 | 3.507 | 3.522 | 3.502
mnn: inception-v3 | 31.715 | 31.699 | 31.833 | 31.583
onednn: IP Shapes 1D - f32 - CPU | 5.74885 | 6.08745 | 6.10535 | 6.08841
onednn: IP Shapes 3D - f32 - CPU | 9.33806 | 12.5133 | 12.6719 | 13.5888
onednn: IP Shapes 1D - u8s8f32 - CPU | 3.31567 | 3.31990 | 3.31794 | 3.32207
onednn: IP Shapes 3D - u8s8f32 - CPU | 2.35835 | 2.51852 | 2.51603 | 2.50818
onednn: Convolution Batch Shapes Auto - f32 - CPU | 24.0970 | 24.9312 | 25.0245 | 25.1555
onednn: Deconvolution Batch shapes_1d - f32 - CPU | 7.28655 | 7.25209 | 7.24862 | 7.25760
onednn: Deconvolution Batch shapes_3d - f32 - CPU | 9.42941 | 9.55913 | 9.50810 | 9.39419
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 18.1639 | 19.0459 | 19.1537 | 19.5227
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 3.11447 | 3.13783 | 3.12549 | 3.11856
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 6.31785 | 6.24856 | 6.28776 | 6.21044
onednn: Recurrent Neural Network Training - f32 - CPU | 4687.02 | 4739.81 | 4742.25 | 4775.31
onednn: Recurrent Neural Network Inference - f32 - CPU | 2712.79 | 2746.26 | 2755.81 | 2785.40
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 4693.82 | 4750.44 | 4748.24 | 4822.05
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 2710.16 | 2750.23 | 2758.35 | 2776.61
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 4.43829 | 4.59147 | 4.59240 | 4.58797
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 4688.65 | 4745.82 | 4750.15 | 4784.78
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 2716.89 | 2750.92 | 2756.81 | 2778.13
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 4.95866 | 4.96304 | 4.95866 | 4.96107
incompact3d: input.i3d 129 Cells Per Direction | 50.8227946 | 52.4945691 | 51.3005333 | 52.8742867
incompact3d: input.i3d 192 Cells Per Direction | 455.435414 | 457.687053 | 457.613190 | 459.686717
sysbench: RAM / Memory | 12977.84 | 12774.42 | 13109.86 | 13237.06
sysbench: CPU | 8221.43 | 8047.96 | 8211.92 | 8218.83
aom-av1: Speed 0 Two-Pass | 0.19 | 0.19 | 0.19 | 0.19
aom-av1: Speed 4 Two-Pass | 4.78 | 4.76 | 4.76 | 4.76
aom-av1: Speed 6 Realtime | 16.36 | 16.28 | 16.28 | 16.24
aom-av1: Speed 6 Two-Pass | 13.23 | 13.21 | 13.21 | 13.2
aom-av1: Speed 8 Realtime | 79.67 | 78.92 | 78.85 | 78.36
svt-vp9: VMAF Optimized - Bosphorus 1080p | 114.66 | 114.39 | 114.20 | 113.07
svt-vp9: PSNR/SSIM Optimized - Bosphorus 1080p | 115.10 | 115.03 | 114.40 | 113.34
svt-vp9: Visual Quality Optimized - Bosphorus 1080p | 93.79 | 93.46 | 93.81 | 93.23
svt-hevc: 1 - Bosphorus 1080p | 4.39 | 4.39 | 4.39 | 4.38
svt-hevc: 7 - Bosphorus 1080p | 63.38 | 63.31 | 63.26 | 63.20
svt-hevc: 10 - Bosphorus 1080p | 130.50 | 130.96 | 130.43 | 130.15
build-mesa: Time To Compile | 99.744 | 98.712 | 98.769 | 99.020
build-nodejs: Time To Compile | 890.183 | 890.007 | - | 891.236
simdjson: Kostya | 2.54 | 2.55 | 2.54 | 2.54
simdjson: LargeRand | 0.87 | 0.87 | 0.88 | 0.87
simdjson: PartialTweets | 3.60 | 3.60 | 3.6 | 3.60
simdjson: DistinctUserID | 3.74 | 3.74 | 3.74 | 3.73
Basis Universal 1.13, Settings: ETC1S (Seconds, Fewer Is Better)
1: 32.52 (SE +/- 0.02, N = 3)
2: 32.57 (SE +/- 0.01, N = 3)
3: 32.55 (SE +/- 0.01, N = 3)
4: 32.59 (SE +/- 0.01, N = 3)
(CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Basis Universal 1.13, Settings: UASTC Level 0 (Seconds, Fewer Is Better)
1: 9.753 (SE +/- 0.009, N = 3)
2: 9.772 (SE +/- 0.008, N = 3)
3: 9.767 (SE +/- 0.004, N = 3)
4: 9.795 (SE +/- 0.006, N = 3)
(CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Basis Universal 1.13, Settings: UASTC Level 2 (Seconds, Fewer Is Better)
1: 53.51 (SE +/- 0.00, N = 3)
2: 53.53 (SE +/- 0.02, N = 3)
3: 53.53 (SE +/- 0.01, N = 3)
4: 53.55 (SE +/- 0.01, N = 3)
(CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Basis Universal 1.13, Settings: UASTC Level 3 (Seconds, Fewer Is Better)
1: 106.52 (SE +/- 0.02, N = 3)
2: 106.47 (SE +/- 0.02, N = 3)
3: 106.54 (SE +/- 0.02, N = 3)
4: 106.52 (SE +/- 0.01, N = 3)
(CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

ASTC Encoder 2.4, Preset: Medium (Seconds, Fewer Is Better)
1: 8.1425 (SE +/- 0.0093, N = 3)
2: 8.1432 (SE +/- 0.0128, N = 3)
3: 8.1480 (SE +/- 0.0136, N = 3)
4: 8.1484 (SE +/- 0.0037, N = 3)
(CXX) g++ options: -O3 -flto -pthread

ASTC Encoder 2.4, Preset: Thorough (Seconds, Fewer Is Better)
1: 27.22 (SE +/- 0.01, N = 3)
2: 27.22 (SE +/- 0.00, N = 3)
3: 27.23 (SE +/- 0.00, N = 3)
4: 27.23 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -flto -pthread

ASTC Encoder 2.4, Preset: Exhaustive (Seconds, Fewer Is Better)
1: 208.43 (SE +/- 0.03, N = 3)
2: 208.47 (SE +/- 0.04, N = 3)
3: 208.54 (SE +/- 0.11, N = 3)
4: 208.47 (SE +/- 0.11, N = 3)
(CXX) g++ options: -O3 -flto -pthread

Mobile Neural Network 1.1.3, Model: SqueezeNetV1.0 (ms, Fewer Is Better)
1: 5.302 (SE +/- 0.068, N = 3; MIN: 5.09 / MAX: 15.69)
2: 5.285 (SE +/- 0.041, N = 3; MIN: 5.07 / MAX: 16.99)
3: 5.307 (SE +/- 0.031, N = 3; MIN: 5.1 / MAX: 17.06)
4: 5.272 (SE +/- 0.031, N = 3; MIN: 5.07 / MAX: 15.44)
(CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 1.1.3, Model: resnet-v2-50 (ms, Fewer Is Better)
1: 26.95 (SE +/- 0.03, N = 3; MIN: 26.78 / MAX: 42.17)
2: 26.93 (SE +/- 0.05, N = 3; MIN: 26.76 / MAX: 36.61)
3: 27.02 (SE +/- 0.03, N = 3; MIN: 26.85 / MAX: 36.71)
4: 26.98 (SE +/- 0.07, N = 3; MIN: 26.7 / MAX: 37.05)
(CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 1.1.3, Model: MobileNetV2_224 (ms, Fewer Is Better)
1: 3.347 (SE +/- 0.032, N = 3; MIN: 3.18 / MAX: 13.7)
2: 3.372 (SE +/- 0.024, N = 3; MIN: 3.21 / MAX: 14.08)
3: 3.396 (SE +/- 0.018, N = 3; MIN: 3.24 / MAX: 13.3)
4: 3.389 (SE +/- 0.029, N = 3; MIN: 3.23 / MAX: 14.6)
(CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 1.1.3, Model: mobilenet-v1-1.0 (ms, Fewer Is Better)
1: 3.509 (SE +/- 0.003, N = 3; MIN: 3.45 / MAX: 13.33)
2: 3.507 (SE +/- 0.011, N = 3; MIN: 3.45 / MAX: 5.69)
3: 3.522 (SE +/- 0.002, N = 3; MIN: 3.46 / MAX: 10.92)
4: 3.502 (SE +/- 0.008, N = 3; MIN: 3.44 / MAX: 4.79)
(CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 1.1.3, Model: inception-v3 (ms, Fewer Is Better)
1: 31.72 (SE +/- 0.06, N = 3; MIN: 31.44 / MAX: 40.88)
2: 31.70 (SE +/- 0.10, N = 3; MIN: 31.46 / MAX: 41.89)
3: 31.83 (SE +/- 0.11, N = 3; MIN: 31.51 / MAX: 42.73)
4: 31.58 (SE +/- 0.05, N = 3; MIN: 31.39 / MAX: 41.38)
(CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

oneDNN 2.1.2, Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
1: 5.74885 (SE +/- 0.04456, N = 3; MIN: 5.6)
2: 6.08745 (SE +/- 0.03402, N = 3; MIN: 5.94)
3: 6.10535 (SE +/- 0.04121, N = 3; MIN: 5.93)
4: 6.08841 (SE +/- 0.02504, N = 3; MIN: 5.93)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 2.1.2, Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
1: 9.33806 (SE +/- 0.04042, N = 3; MIN: 9.17)
2: 12.51330 (SE +/- 0.05232, N = 3; MIN: 12.3)
3: 12.67190 (SE +/- 0.04540, N = 3; MIN: 12.47)
4: 13.58880 (SE +/- 0.07677, N = 3; MIN: 13.41)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

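The IP Shapes 3D f32 result above shows the widest gap between the four results in this file. A minimal Python sketch of expressing that gap as a percentage relative to result 1 follows. The helper name `percent_change` and the choice of result 1 as the baseline are assumptions made for illustration; they are not part of the Phoronix Test Suite output.

```python
# Mean times in ms (fewer is better), copied from the oneDNN
# IP Shapes 3D - f32 - CPU result for identifiers 1 through 4.
results_ms = {1: 9.33806, 2: 12.51330, 3: 12.67190, 4: 13.58880}


def percent_change(baseline: float, value: float) -> float:
    """Return the change of value relative to baseline, in percent."""
    return (value - baseline) / baseline * 100.0


# Assumption: result 1 is treated as the baseline.
baseline = results_ms[1]
changes = {run: percent_change(baseline, ms) for run, ms in results_ms.items()}

for run, change in changes.items():
    print(f"{run}: {results_ms[run]:.5f} ms ({change:+.1f}% vs 1)")
```

Because this is a fewer-is-better metric, a positive percentage means the result was slower than the baseline.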
oneDNN 2.1.2, Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
1: 3.31567 (SE +/- 0.01477, N = 3; MIN: 3.25)
2: 3.31990 (SE +/- 0.01778, N = 3; MIN: 3.25)
3: 3.31794 (SE +/- 0.01808, N = 3; MIN: 3.25)
4: 3.32207 (SE +/- 0.01663, N = 3; MIN: 3.25)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 2.1.2, Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
1: 2.35835 (SE +/- 0.01460, N = 3; MIN: 2.3)
2: 2.51852 (SE +/- 0.01713, N = 3; MIN: 2.46)
3: 2.51603 (SE +/- 0.01626, N = 3; MIN: 2.46)
4: 2.50818 (SE +/- 0.01463, N = 3; MIN: 2.45)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 2.1.2, Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
1: 24.10 (SE +/- 0.02, N = 3; MIN: 23.6)
2: 24.93 (SE +/- 0.02, N = 3; MIN: 24.61)
3: 25.02 (SE +/- 0.03, N = 3; MIN: 24.73)
4: 25.16 (SE +/- 0.03, N = 3; MIN: 24.53)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 2.1.2, Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
1: 7.28655 (SE +/- 0.02301, N = 3; MIN: 7.16)
2: 7.25209 (SE +/- 0.01513, N = 3; MIN: 7.16)
3: 7.24862 (SE +/- 0.01801, N = 3; MIN: 7.16)
4: 7.25760 (SE +/- 0.01383, N = 3; MIN: 7.17)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 2.1.2, Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
1: 9.42941 (SE +/- 0.06124, N = 3; MIN: 9.24)
2: 9.55913 (SE +/- 0.01638, N = 3; MIN: 9.34)
3: 9.50810 (SE +/- 0.06496, N = 3; MIN: 9.23)
4: 9.39419 (SE +/- 0.01310, N = 3; MIN: 9.21)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 2.1.2, Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
1: 18.16 (SE +/- 0.04, N = 3; MIN: 17.89)
2: 19.05 (SE +/- 0.04, N = 3; MIN: 18.69)
3: 19.15 (SE +/- 0.04, N = 3; MIN: 18.77)
4: 19.52 (SE +/- 0.04, N = 3; MIN: 19.1)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 2.1.2, Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
1: 3.11447 (SE +/- 0.00994, N = 3; MIN: 3.08)
2: 3.13783 (SE +/- 0.01469, N = 3; MIN: 3.09)
3: 3.12549 (SE +/- 0.00251, N = 3; MIN: 3.09)
4: 3.11856 (SE +/- 0.00550, N = 3; MIN: 3.09)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 2.1.2, Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
1: 6.31785 (SE +/- 0.02640, N = 3; MIN: 6.22)
2: 6.24856 (SE +/- 0.00509, N = 3; MIN: 6.11)
3: 6.28776 (SE +/- 0.02902, N = 3; MIN: 6.15)
4: 6.21044 (SE +/- 0.02410, N = 3; MIN: 6.08)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 2.1.2, Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
1: 4687.02 (SE +/- 9.86, N = 3; MIN: 4552.07)
2: 4739.81 (SE +/- 2.77, N = 3; MIN: 4612.81)
3: 4742.25 (SE +/- 6.95, N = 3; MIN: 4622.61)
4: 4775.31 (SE +/- 8.03, N = 3; MIN: 4644.44)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 2.1.2, Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
1: 2712.79 (SE +/- 1.94, N = 3; MIN: 2634.02)
2: 2746.26 (SE +/- 5.11, N = 3; MIN: 2668.84)
3: 2755.81 (SE +/- 0.72, N = 3; MIN: 2681.59)
4: 2785.40 (SE +/- 11.46, N = 3; MIN: 2704.42)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 2.1.2, Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
1: 4693.82 (SE +/- 3.50, N = 3; MIN: 4566.24)
2: 4750.44 (SE +/- 4.18, N = 3; MIN: 4617.15)
3: 4748.24 (SE +/- 2.95, N = 3; MIN: 4623.59)
4: 4822.05 (SE +/- 19.23, N = 3; MIN: 4660.46)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 2.1.2, Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
1: 2710.16 (SE +/- 5.07, N = 3; MIN: 2633.06)
2: 2750.23 (SE +/- 2.90, N = 3; MIN: 2673.4)
3: 2758.35 (SE +/- 2.26, N = 3; MIN: 2683.59)
4: 2776.61 (SE +/- 2.56, N = 3; MIN: 2704.96)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 2.1.2, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
1: 4.43829 (SE +/- 0.01052, N = 3; MIN: 4.36)
2: 4.59147 (SE +/- 0.00822, N = 3; MIN: 4.52)
3: 4.59240 (SE +/- 0.00804, N = 3; MIN: 4.52)
4: 4.58797 (SE +/- 0.01165, N = 3; MIN: 4.52)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 2.1.2, Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
1: 4688.65 (SE +/- 5.69, N = 3; MIN: 4561.77)
2: 4745.82 (SE +/- 8.86, N = 3; MIN: 4614.86)
3: 4750.15 (SE +/- 5.69, N = 3; MIN: 4624.59)
4: 4784.78 (SE +/- 5.35, N = 3; MIN: 4663.88)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 2.1.2, Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
1: 2716.89 (SE +/- 1.04, N = 3; MIN: 2639.02)
2: 2750.92 (SE +/- 2.48, N = 3; MIN: 2677.79)
3: 2756.81 (SE +/- 2.36, N = 3; MIN: 2683.92)
4: 2778.13 (SE +/- 3.78, N = 3; MIN: 2704.44)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 2.1.2, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
1: 4.95866 (SE +/- 0.01510, N = 3; MIN: 4.89)
2: 4.96304 (SE +/- 0.00833, N = 3; MIN: 4.89)
3: 4.95866 (SE +/- 0.01296, N = 3; MIN: 4.89)
4: 4.96107 (SE +/- 0.01510, N = 3; MIN: 4.89)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

Xcompact3d Incompact3d 2021-03-11, Input: input.i3d 129 Cells Per Direction (Seconds, Fewer Is Better)
1: 50.82 (SE +/- 0.57, N = 3)
2: 52.49 (SE +/- 0.05, N = 3)
3: 51.30 (SE +/- 0.59, N = 3)
4: 52.87 (SE +/- 0.02, N = 3)
(F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi

Xcompact3d Incompact3d 2021-03-11, Input: input.i3d 192 Cells Per Direction (Seconds, Fewer Is Better)
1: 455.44 (SE +/- 0.04, N = 3)
2: 457.69 (SE +/- 1.04, N = 3)
3: 457.61 (SE +/- 0.78, N = 3)
4: 459.69 (SE +/- 1.01, N = 3)
(F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi

Sysbench 1.0.20, Test: RAM / Memory (MiB/sec, More Is Better)
1: 12977.84 (SE +/- 61.19, N = 3)
2: 12774.42 (SE +/- 109.75, N = 3)
3: 13109.86 (SE +/- 133.79, N = 3)
4: 13237.06 (SE +/- 53.34, N = 3)
(CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm

Sysbench 1.0.20, Test: CPU (Events Per Second, More Is Better)
1: 8221.43 (SE +/- 0.76, N = 3)
2: 8047.96 (SE +/- 111.87, N = 4)
3: 8211.92 (SE +/- 2.08, N = 3)
4: 8218.83 (SE +/- 2.96, N = 3)
(CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm

AOM AV1 2.1-rc, Encoder Mode: Speed 0 Two-Pass (Frames Per Second, More Is Better)
1: 0.19 (SE +/- 0.00, N = 3)
2: 0.19 (SE +/- 0.00, N = 3)
3: 0.19 (SE +/- 0.00, N = 3)
4: 0.19 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1 2.1-rc, Encoder Mode: Speed 4 Two-Pass (Frames Per Second, More Is Better)
1: 4.78 (SE +/- 0.00, N = 3)
2: 4.76 (SE +/- 0.01, N = 3)
3: 4.76 (SE +/- 0.00, N = 3)
4: 4.76 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1 2.1-rc, Encoder Mode: Speed 6 Realtime (Frames Per Second, More Is Better)
1: 16.36 (SE +/- 0.01, N = 3)
2: 16.28 (SE +/- 0.02, N = 3)
3: 16.28 (SE +/- 0.02, N = 3)
4: 16.24 (SE +/- 0.03, N = 3)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1 2.1-rc, Encoder Mode: Speed 6 Two-Pass (Frames Per Second, More Is Better)
1: 13.23 (SE +/- 0.01, N = 3)
2: 13.21 (SE +/- 0.02, N = 3)
3: 13.21 (SE +/- 0.00, N = 3)
4: 13.20 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1 2.1-rc, Encoder Mode: Speed 8 Realtime (Frames Per Second, More Is Better)
1: 79.67 (SE +/- 0.03, N = 3)
2: 78.92 (SE +/- 0.04, N = 3)
3: 78.85 (SE +/- 0.18, N = 3)
4: 78.36 (SE +/- 0.07, N = 3)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

SVT-VP9 0.3, Tuning: VMAF Optimized - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
1: 114.66 (SE +/- 0.39, N = 3)
2: 114.39 (SE +/- 0.46, N = 3)
3: 114.20 (SE +/- 0.18, N = 3)
4: 113.07 (SE +/- 0.07, N = 3)
(CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-VP9 0.3, Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
1: 115.10 (SE +/- 0.33, N = 3)
2: 115.03 (SE +/- 0.23, N = 3)
3: 114.40 (SE +/- 0.04, N = 3)
4: 113.34 (SE +/- 0.26, N = 3)
(CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-VP9 0.3, Tuning: Visual Quality Optimized - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
1: 93.79 (SE +/- 0.12, N = 3)
2: 93.46 (SE +/- 0.09, N = 3)
3: 93.81 (SE +/- 0.19, N = 3)
4: 93.23 (SE +/- 0.12, N = 3)
(CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-HEVC 1.5.0, Tuning: 1 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
1: 4.39 (SE +/- 0.01, N = 3)
2: 4.39 (SE +/- 0.00, N = 3)
3: 4.39 (SE +/- 0.01, N = 3)
4: 4.38 (SE +/- 0.01, N = 3)
(CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt

SVT-HEVC 1.5.0, Tuning: 7 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
1: 63.38 (SE +/- 0.10, N = 3)
2: 63.31 (SE +/- 0.13, N = 3)
3: 63.26 (SE +/- 0.10, N = 3)
4: 63.20 (SE +/- 0.10, N = 3)
(CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt

SVT-HEVC 1.5.0, Tuning: 10 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
1: 130.50 (SE +/- 0.37, N = 3)
2: 130.96 (SE +/- 0.44, N = 3)
3: 130.43 (SE +/- 0.42, N = 3)
4: 130.15 (SE +/- 0.41, N = 3)
(CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt

Timed Mesa Compilation 21.0, Time To Compile (Seconds, Fewer Is Better)
1: 99.74 (SE +/- 1.18, N = 3)
2: 98.71 (SE +/- 0.04, N = 3)
3: 98.77 (SE +/- 0.11, N = 3)
4: 99.02 (SE +/- 0.08, N = 3)

Timed Node.js Compilation 15.11, Time To Compile (Seconds, Fewer Is Better)
1: 890.18 (SE +/- 0.38, N = 3)
2: 890.01 (SE +/- 0.12, N = 3)
4: 891.24 (SE +/- 0.18, N = 3)

simdjson 0.8.2, Throughput Test: Kostya (GB/s, More Is Better)
1: 2.54 (SE +/- 0.00, N = 3)
2: 2.55 (SE +/- 0.00, N = 3)
3: 2.54 (SE +/- 0.00, N = 3)
4: 2.54 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -pthread

simdjson 0.8.2, Throughput Test: LargeRandom (GB/s, More Is Better)
1: 0.87 (SE +/- 0.00, N = 3)
2: 0.87 (SE +/- 0.00, N = 3)
3: 0.88 (SE +/- 0.00, N = 3)
4: 0.87 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -pthread

simdjson 0.8.2, Throughput Test: PartialTweets (GB/s, More Is Better)
1: 3.60 (SE +/- 0.00, N = 3)
2: 3.60 (SE +/- 0.01, N = 3)
3: 3.60 (SE +/- 0.00, N = 3)
4: 3.60 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -pthread

simdjson 0.8.2, Throughput Test: DistinctUserID (GB/s, More Is Better)
1: 3.74 (SE +/- 0.00, N = 3)
2: 3.74 (SE +/- 0.00, N = 3)
3: 3.74 (SE +/- 0.00, N = 3)
4: 3.73 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -pthread

Phoronix Test Suite v10.8.5