xeon platinum 8380 january
2 x Intel Xeon Platinum 8380 testing with an Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS) and ASPEED on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2301068-NE-XEONPLATI59&rdt&grs .
Runs: a, b, c
  Processor: 2 x Intel Xeon Platinum 8380 @ 3.40GHz (80 Cores / 160 Threads)
  Motherboard: Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS)
  Chipset: Intel Device 0998
  Memory: 512GB
  Disk: 3841GB Micron_9300_MTFDHAL3T8TDP
  Graphics: ASPEED
  Monitor: VE228
  Network: 2 x Intel X710 for 10GBASE-T + 2 x Intel E810-C for QSFP
  OS: Ubuntu 22.04
  Kernel: 5.15.0-47-generic (x86_64)
  Desktop: GNOME Shell 42.4
  Display Server: X Server 1.21.1.3
  Vulkan: 1.2.204
  Compiler: GCC 11.2.0
  File-System: ext4
  Screen Resolution: 1920x1080
Kernel Details
  - Transparent Huge Pages: madvise
Compiler Details
  - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details
  - Scaling Governor: intel_pstate performance (EPP: performance)
  - CPU Microcode: 0xd000375
Python Details
  - Python 3.10.6
Security Details
  - itlb_multihit: Not affected
  - l1tf: Not affected
  - mds: Not affected
  - meltdown: Not affected
  - mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable
  - retbleed: Not affected
  - spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  - spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  - spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling
  - srbds: Not affected
  - tsx_async_abort: Not affected
Test | a | b | c
cockroach: KV, 95% Reads - 512 | 124770.8 | 130818.8 | 104051.6
cockroach: KV, 95% Reads - 256 | 109396.8 | 132576 | 114216.8
cockroach: KV, 60% Reads - 256 | 104879 | 112692.4 | 96996.9
cockroach: KV, 95% Reads - 1024 | 114587.7 | 109363 | 126309.9
cockroach: KV, 50% Reads - 256 | 103724.9 | 95363.3 | 108532.6
cockroach: KV, 50% Reads - 512 | 102949.9 | 90463.4 | 98677.7
cockroach: KV, 60% Reads - 1024 | 102024.5 | 89996.2 | 97814.8
cockroach: KV, 50% Reads - 128 | 101191.6 | 91347.7 | 103180.8
cockroach: KV, 95% Reads - 128 | 115235.9 | 128914 | 129287.8
kvazaar: Bosphorus 1080p - Super Fast | 183.22 | 176.22 | 195.56
cockroach: MoVR - 512 | 1004.6 | 1051.3 | 948.9
onednn: Recurrent Neural Network Training - f32 - CPU | 755.374 | 812.452 | 736.401
cockroach: KV, 50% Reads - 1024 | 95936.8 | 87162.1 | 91821.1
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 516.697 | 496.945 | 474.179
cockroach: KV, 10% Reads - 1024 | 75439.3 | 80303.3 | 81526.5
onednn: IP Shapes 3D - f32 - CPU | 2.02641 | 2.05187 | 2.18165
cockroach: KV, 60% Reads - 128 | 96257.5 | 97956.1 | 103471.9
numenta-nab: KNN CAD | 117.924 | 111.164 | 116.755
cockroach: KV, 10% Reads - 512 | 81929.6 | 78488.4 | 82853
cockroach: KV, 10% Reads - 256 | 85831.0 | 83364.9 | 87542.9
kvazaar: Bosphorus 1080p - Ultra Fast | 189.93 | 183.2 | 181.68
onednn: IP Shapes 3D - bf16bf16bf16 - CPU | 2.78657 | 2.90967 | 2.79006
openvino: Machine Translation EN To DE FP16 - CPU | 259.74 | 268.94 | 270.5
cockroach: KV, 10% Reads - 128 | 81692.1 | 79813.8 | 78523.6
openvino: Machine Translation EN To DE FP16 - CPU | 153.57 | 148.47 | 147.66
uvg266: Bosphorus 4K - Ultra Fast | 42.96 | 41.72 | 43.32
build-linux-kernel: defconfig | 26.784 | 27.683 | 27.777
kvazaar: Bosphorus 4K - Ultra Fast | 48.64 | 47.98 | 49.69
numenta-nab: Windowed Gaussian | 6.417 | 6.264 | 6.197
cockroach: MoVR - 256 | 978.3 | 946.3 | 956.1
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 736.729 | 713.105 | 716.123
numenta-nab: Contextual Anomaly Detector OSE | 42.179 | 42.685 | 43.507
cockroach: MoVR - 128 | 1005.3 | 1034.4 | 1036
cockroach: KV, 60% Reads - 512 | 103167.1 | 102734.7 | 105826.5
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 0.201735 | 0.197708 | 0.196388
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 0.232192 | 0.235762 | 0.2385
numenta-nab: Earthgecko Skyline | 81.033 | 83.18 | 81.082
uvg266: Bosphorus 1080p - Ultra Fast | 151.35 | 152.75 | 149.16
kvazaar: Bosphorus 1080p - Very Fast | 177.06 | 179.03 | 180.91
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 0.170641 | 0.167326 | 0.170187
onednn: Convolution Batch Shapes Auto - f32 - CPU | 1.43378 | 1.40769 | 1.40601
uvg266: Bosphorus 1080p - Very Fast | 147.71 | 145.05 | 146.66
kvazaar: Bosphorus 1080p - Slow | 83.36 | 81.99 | 83.49
onednn: Deconvolution Batch shapes_1d - f32 - CPU | 6.94506 | 6.94982 | 7.07135
uvg266: Bosphorus 1080p - Super Fast | 148.14 | 146.01 | 148.56
kvazaar: Bosphorus 4K - Super Fast | 46.81 | 47.3 | 47.58
openvino: Person Detection FP16 - CPU | 13.42 | 13.4 | 13.21
cockroach: MoVR - 1024 | 995.0 | 980.1 | 982.2
kvazaar: Bosphorus 1080p - Medium | 85.78 | 85.35 | 84.56
uvg266: Bosphorus 4K - Super Fast | 43.09 | 43.67 | 43.63
numenta-nab: Bayesian Changepoint | 24.128 | 23.827 | 23.823
numenta-nab: Relative Entropy | 12.949 | 13.106 | 12.975
openvino: Person Detection FP16 - CPU | 2942.97 | 2944.02 | 2977.64
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 0.370879 | 0.369915 | 0.374184
onednn: Deconvolution Batch shapes_3d - f32 - CPU | 0.875625 | 0.866728 | 0.870342
onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU | 3.59352 | 3.61988 | 3.58534
brl-cad: VGR Performance Metric | 2460090 | 2441642 | 2462204
uvg266: Bosphorus 4K - Very Fast | 41.76 | 41.69 | 42.04
onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU | 2.08794 | 2.09255 | 2.10535
openvkl: vklBenchmark ISPC | 921 | 915 | 922
uvg266: Bosphorus 1080p - Medium | 55.89 | 55.72 | 55.5
build-linux-kernel: allmodconfig | 238.742 | 240.391 | 239.632
openvkl: vklBenchmark Scalar | 438 | 441 | 441
openvino: Age Gender Recognition Retail 0013 FP16 - CPU | 1.54 | 1.53 | 1.53
uvg266: Bosphorus 1080p - Slow | 50.56 | 50.64 | 50.32
kvazaar: Bosphorus 4K - Slow | 20.06 | 19.94 | 20.06
onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU | 3.73078 | 3.75292 | 3.73148
openvino: Age Gender Recognition Retail 0013 FP16 - CPU | 47089.49 | 47321.79 | 47285.92
openvino: Person Detection FP32 - CPU | 13.09 | 13.15 | 13.14
kvazaar: Bosphorus 4K - Medium | 20.65 | 20.65 | 20.57
openvino: Weld Porosity Detection FP16-INT8 - CPU | 9.01 | 9.02 | 8.99
openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU | 51056.58 | 51048.88 | 51211.85
openvino: Weld Porosity Detection FP16-INT8 - CPU | 8849.01 | 8843.72 | 8870.43
openvino: Person Detection FP32 - CPU | 3007.88 | 2999.57 | 3005.19
openvino: Vehicle Detection FP16-INT8 - CPU | 4354.14 | 4366.1 | 4360.7
uvg266: Bosphorus 4K - Slow | 14.82 | 14.82 | 14.78
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 1.15870 | 1.16066 | 1.16132
openvino: Vehicle Detection FP16-INT8 - CPU | 9.16 | 9.14 | 9.15
openvino: Weld Porosity Detection FP16 - CPU | 33.52 | 33.57 | 33.5
openvino: Face Detection FP16 - CPU | 1726.75 | 1729.69 | 1726.31
openvino: Weld Porosity Detection FP16 - CPU | 2379.62 | 2376.32 | 2380.57
openvino: Face Detection FP16-INT8 - CPU | 429.25 | 429.89 | 429.47
openvino: Vehicle Detection FP16 - CPU | 38.1 | 38.13 | 38.08
openvino: Face Detection FP16 - CPU | 22.95 | 22.98 | 22.95
openvino: Face Detection FP16-INT8 - CPU | 92.95 | 92.83 | 92.93
openvino: Vehicle Detection FP16 - CPU | 1047.87 | 1047.1 | 1048.4
openvino: Person Vehicle Bike Detection FP16 - CPU | 2157.13 | 2154.54 | 2156.73
openvino: Person Vehicle Bike Detection FP16 - CPU | 18.5 | 18.52 | 18.5
kvazaar: Bosphorus 4K - Very Fast | 44.16 | 44.16 | 44.19
openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU | 1.41 | 1.41 | 1.41
uvg266: Bosphorus 4K - Medium | 16.71 | 16.71 | 16.71
onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU | 11.55603 | 12.1656 | 8.11091
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 504.670 | 484.337 | 463.131
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 755.612 | 835.126 | 766.996
onednn: Recurrent Neural Network Inference - f32 - CPU | 485.542 | 482.719 | 499.922
onednn: IP Shapes 1D - bf16bf16bf16 - CPU | 4.99822 | 5.50365 | 5.76784
onednn: IP Shapes 3D - u8s8f32 - CPU | 0.566395 | 0.697435 | 0.634303
onednn: IP Shapes 1D - u8s8f32 - CPU | 2.80808 | 3.13206 | 2.13509
onednn: IP Shapes 1D - f32 - CPU | 1.40653 | 1.31674 | 1.1324
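One way to read the overview figures is to compare how far the three runs diverge on each test. The following is a minimal sketch, not part of the Phoronix Test Suite output: the helper name `relative_spread` and the choice of (max - min) / mean as the metric are assumptions made for illustration, and only three rows are copied in as sample input.

```python
# Results for runs a, b, c, copied from three rows of the overview figures.
results = {
    "cockroach: KV, 95% Reads - 512": (124770.8, 130818.8, 104051.6),
    "build-linux-kernel: defconfig": (26.784, 27.683, 27.777),
    "brl-cad: VGR Performance Metric": (2460090, 2441642, 2462204),
}


def relative_spread(values):
    """Return (max - min) / mean for one test's run results (illustrative metric)."""
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / mean


spreads = {name: relative_spread(vals) for name, vals in results.items()}

# Print the tests from most to least variable across the three runs.
for name, spread in sorted(spreads.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {spread:.2%}")
```

Because the metric is unit-free, it can be applied to "More Is Better" and "Fewer Is Better" tests alike; it says nothing about which run is faster, only how much the runs disagree.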
CockroachDB 22.2 - Workload: KV, 95% Reads - Concurrency: 512
  ops/s, More Is Better: a = 124770.8, b = 130818.8, c = 104051.6
CockroachDB 22.2 - Workload: KV, 95% Reads - Concurrency: 256
  ops/s, More Is Better: a = 109396.8, b = 132576.0, c = 114216.8
CockroachDB 22.2 - Workload: KV, 60% Reads - Concurrency: 256
  ops/s, More Is Better: a = 104879.0, b = 112692.4, c = 96996.9
CockroachDB 22.2 - Workload: KV, 95% Reads - Concurrency: 1024
  ops/s, More Is Better: a = 114587.7, b = 109363.0, c = 126309.9
CockroachDB 22.2 - Workload: KV, 50% Reads - Concurrency: 256
  ops/s, More Is Better: a = 103724.9, b = 95363.3, c = 108532.6
CockroachDB 22.2 - Workload: KV, 50% Reads - Concurrency: 512
  ops/s, More Is Better: a = 102949.9, b = 90463.4, c = 98677.7
CockroachDB 22.2 - Workload: KV, 60% Reads - Concurrency: 1024
  ops/s, More Is Better: a = 102024.5, b = 89996.2, c = 97814.8
CockroachDB 22.2 - Workload: KV, 50% Reads - Concurrency: 128
  ops/s, More Is Better: a = 101191.6, b = 91347.7, c = 103180.8
CockroachDB 22.2 - Workload: KV, 95% Reads - Concurrency: 128
  ops/s, More Is Better: a = 115235.9, b = 128914.0, c = 129287.8
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Super Fast
  Frames Per Second, More Is Better: a = 183.22, b = 176.22, c = 195.56
  SE +/- 3.56, N = 3
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
CockroachDB 22.2 - Workload: MoVR - Concurrency: 512
  ops/s, More Is Better: a = 1004.6, b = 1051.3, c = 948.9
  SE +/- 12.10, N = 3
oneDNN 3.0 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better: a = 755.37 (MIN: 720.26), b = 812.45 (MIN: 775.85), c = 736.40 (MIN: 711.54)
  SE +/- 5.99, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
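Several results carry an "SE +/- x, N = 3" annotation. As a rough sketch of what such a figure usually means, the standard error of the mean can be computed as the sample standard deviation divided by the square root of the trial count. Whether the Phoronix Test Suite uses exactly this formula is an assumption here, and the trial values below are hypothetical because the export does not list per-trial times.

```python
import math
import statistics


def standard_error(trials):
    """Sample standard deviation divided by sqrt(N) (assumed SE definition)."""
    return statistics.stdev(trials) / math.sqrt(len(trials))


# HYPOTHETICAL trial times in ms; not taken from the result file.
hypothetical_trials_ms = [750.0, 755.0, 761.0]

mean_ms = statistics.mean(hypothetical_trials_ms)
se_ms = standard_error(hypothetical_trials_ms)
print(f"mean = {mean_ms:.2f} ms, SE +/- {se_ms:.2f}, N = {len(hypothetical_trials_ms)}")
```

With only three trials per result, a gap between two runs that is smaller than a few standard errors is hard to distinguish from run-to-run noise.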
CockroachDB 22.2 - Workload: KV, 50% Reads - Concurrency: 1024
  ops/s, More Is Better: a = 95936.8, b = 87162.1, c = 91821.1
oneDNN 3.0 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better: a = 516.70 (MIN: 470.69), b = 496.95 (MIN: 480.38), c = 474.18 (MIN: 461.99)
  SE +/- 15.41, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
CockroachDB 22.2 - Workload: KV, 10% Reads - Concurrency: 1024
  ops/s, More Is Better: a = 75439.3, b = 80303.3, c = 81526.5
oneDNN 3.0 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better: a = 2.02641 (MIN: 1.82), b = 2.05187 (MIN: 1.87), c = 2.18165 (MIN: 1.94)
  SE +/- 0.02958, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
CockroachDB 22.2 - Workload: KV, 60% Reads - Concurrency: 128
  ops/s, More Is Better: a = 96257.5, b = 97956.1, c = 103471.9
Numenta Anomaly Benchmark 1.1 - Detector: KNN CAD
  Seconds, Fewer Is Better: a = 117.92, b = 111.16, c = 116.76
CockroachDB 22.2 - Workload: KV, 10% Reads - Concurrency: 512
  ops/s, More Is Better: a = 81929.6, b = 78488.4, c = 82853.0
CockroachDB 22.2 - Workload: KV, 10% Reads - Concurrency: 256
  ops/s, More Is Better: a = 85831.0, b = 83364.9, c = 87542.9
  SE +/- 365.94, N = 3
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Ultra Fast
  Frames Per Second, More Is Better: a = 189.93, b = 183.20, c = 181.68
  SE +/- 2.47, N = 3
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
oneDNN 3.0 - Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU
  ms, Fewer Is Better: a = 2.78657 (MIN: 2.04), b = 2.90967 (MIN: 2.19), c = 2.79006 (MIN: 2.05)
  SE +/- 0.06218, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenVINO 2022.3 - Model: Machine Translation EN To DE FP16 - Device: CPU
  FPS, More Is Better: a = 259.74, b = 268.94, c = 270.50
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
CockroachDB 22.2 - Workload: KV, 10% Reads - Concurrency: 128
  ops/s, More Is Better: a = 81692.1, b = 79813.8, c = 78523.6
  SE +/- 1335.25, N = 3
OpenVINO 2022.3 - Model: Machine Translation EN To DE FP16 - Device: CPU
  ms, Fewer Is Better: a = 153.57 (MIN: 64.36 / MAX: 1093.49), b = 148.47 (MIN: 80.24 / MAX: 1097.3), c = 147.66 (MIN: 132.11 / MAX: 1013.19)
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
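The OpenVINO tests report both throughput (FPS) and average latency (ms) for the same model, and the two can be cross-checked with Little's law (items in flight = throughput x latency). The sketch below applies it to the Machine Translation EN To DE FP16 figures. The export does not state how many inference requests ran in parallel, so the output is only an implied estimate, and the function name `implied_in_flight` is made up for this example.

```python
# (throughput in FPS, average latency in ms) for runs a, b, c, copied from the
# two Machine Translation EN To DE FP16 results.
runs = {
    "a": (259.74, 153.57),
    "b": (268.94, 148.47),
    "c": (270.50, 147.66),
}


def implied_in_flight(fps, latency_ms):
    """Little's law: average items in flight = throughput * latency (in seconds)."""
    return fps * (latency_ms / 1000.0)


in_flight = {run: implied_in_flight(fps, ms) for run, (fps, ms) in runs.items()}

for run, count in in_flight.items():
    print(f"{run}: roughly {count:.1f} requests in flight")
```

If the three estimates agree closely, the FPS and latency numbers are consistent with each other, and the higher FPS in a run comes with proportionally lower latency rather than a change in parallelism.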
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Ultra Fast
  Frames Per Second, More Is Better: a = 42.96, b = 41.72, c = 43.32
  SE +/- 0.62, N = 3
Timed Linux Kernel Compilation 6.1 - Build: defconfig
  Seconds, Fewer Is Better: a = 26.78, b = 27.68, c = 27.78
  SE +/- 0.48, N = 3
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Ultra Fast
  Frames Per Second, More Is Better: a = 48.64, b = 47.98, c = 49.69
  SE +/- 0.33, N = 3
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Numenta Anomaly Benchmark 1.1 - Detector: Windowed Gaussian
  Seconds, Fewer Is Better: a = 6.417, b = 6.264, c = 6.197
CockroachDB 22.2 - Workload: MoVR - Concurrency: 256
  ops/s, More Is Better: a = 978.3, b = 946.3, c = 956.1
  SE +/- 6.48, N = 3
oneDNN 3.0 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
  ms, Fewer Is Better: a = 736.73 (MIN: 693.29), b = 713.11 (MIN: 689.07), c = 716.12 (MIN: 689.45)
  SE +/- 13.60, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Numenta Anomaly Benchmark 1.1 - Detector: Contextual Anomaly Detector OSE
  Seconds, Fewer Is Better: a = 42.18, b = 42.69, c = 43.51
CockroachDB 22.2 - Workload: MoVR - Concurrency: 128
  ops/s, More Is Better: a = 1005.3, b = 1034.4, c = 1036.0
  SE +/- 26.40, N = 3
CockroachDB 22.2 - Workload: KV, 60% Reads - Concurrency: 512
  ops/s, More Is Better: a = 103167.1, b = 102734.7, c = 105826.5
oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better: a = 0.201735 (MIN: 0.19), b = 0.197708 (MIN: 0.19), c = 0.196388 (MIN: 0.18)
  SE +/- 0.000867, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better: a = 0.232192 (MIN: 0.21), b = 0.235762 (MIN: 0.22), c = 0.238500 (MIN: 0.22)
  SE +/- 0.001872, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Numenta Anomaly Benchmark 1.1 - Detector: Earthgecko Skyline
  Seconds, Fewer Is Better: a = 81.03, b = 83.18, c = 81.08
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Ultra Fast
  Frames Per Second, More Is Better: a = 151.35, b = 152.75, c = 149.16
  SE +/- 1.05, N = 3
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Very Fast
  Frames Per Second, More Is Better: a = 177.06, b = 179.03, c = 180.91
  SE +/- 1.82, N = 3
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
oneDNN 3.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better: a = 0.170641 (MIN: 0.15), b = 0.167326 (MIN: 0.15), c = 0.170187 (MIN: 0.15)
  SE +/- 0.002111, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better: a = 1.43378 (MIN: 1.28), b = 1.40769 (MIN: 1.23), c = 1.40601 (MIN: 1.29)
  SE +/- 0.00817, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Very Fast
  Frames Per Second, More Is Better: a = 147.71, b = 145.05, c = 146.66
  SE +/- 0.68, N = 3
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Slow
  Frames Per Second, More Is Better: a = 83.36, b = 81.99, c = 83.49
  SE +/- 0.58, N = 3
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
oneDNN 3.0 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better: a = 6.94506 (MIN: 6.34), b = 6.94982 (MIN: 6.35), c = 7.07135 (MIN: 6.32)
  SE +/- 0.04586, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Super Fast
  Frames Per Second, More Is Better: a = 148.14, b = 146.01, c = 148.56
  SE +/- 1.19, N = 3
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Super Fast
  Frames Per Second, More Is Better: a = 46.81, b = 47.30, c = 47.58
  SE +/- 0.86, N = 3
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
OpenVINO 2022.3 - Model: Person Detection FP16 - Device: CPU
  FPS, More Is Better: a = 13.42, b = 13.40, c = 13.21
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
CockroachDB 22.2 - Workload: MoVR - Concurrency: 1024
  ops/s, More Is Better: a = 995.0, b = 980.1, c = 982.2
  SE +/- 15.13, N = 3
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Medium
  Frames Per Second, More Is Better: a = 85.78, b = 85.35, c = 84.56
  SE +/- 0.19, N = 3
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Super Fast
  Frames Per Second, More Is Better: a = 43.09, b = 43.67, c = 43.63
  SE +/- 0.70, N = 3
Numenta Anomaly Benchmark 1.1 - Detector: Bayesian Changepoint
  Seconds, Fewer Is Better: a = 24.13, b = 23.83, c = 23.82
Numenta Anomaly Benchmark 1.1 - Detector: Relative Entropy
  Seconds, Fewer Is Better: a = 12.95, b = 13.11, c = 12.98
OpenVINO 2022.3 - Model: Person Detection FP16 - Device: CPU
  ms, Fewer Is Better: a = 2942.97 (MIN: 1578.88 / MAX: 3433.68), b = 2944.02 (MIN: 1597.94 / MAX: 3616.23), c = 2977.64 (MIN: 2273.94 / MAX: 3469.82)
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
oneDNN 3.0 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better: a = 0.370879 (MIN: 0.33), b = 0.369915 (MIN: 0.33), c = 0.374184 (MIN: 0.34)
  SE +/- 0.002095, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better: a = 0.875625 (MIN: 0.83), b = 0.866728 (MIN: 0.83), c = 0.870342 (MIN: 0.83)
  SE +/- 0.002166, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU
  ms, Fewer Is Better: a = 3.59352 (MIN: 3.52), b = 3.61988 (MIN: 3.53), c = 3.58534 (MIN: 3.51)
  SE +/- 0.00722, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
BRL-CAD 7.34 - VGR Performance Metric
  VGR Performance Metric, More Is Better: a = 2460090, b = 2441642, c = 2462204
  1. (CXX) g++ options: -std=c++14 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lregex_brl -lz_brl -lnetpbm -ldl -lm -ltk8.6
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Very Fast
  Frames Per Second, More Is Better: a = 41.76, b = 41.69, c = 42.04
  SE +/- 0.54, N = 3
oneDNN 3.0 - Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU
  ms, Fewer Is Better: a = 2.08794 (MIN: 2.03), b = 2.09255 (MIN: 2.03), c = 2.10535 (MIN: 2.03)
  SE +/- 0.00117, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenVKL 1.3.1 - Benchmark: vklBenchmark ISPC
  Items / Sec, More Is Better: a = 921 (MIN: 140 / MAX: 7539), b = 915 (MIN: 141 / MAX: 7348), c = 922 (MIN: 141 / MAX: 7376)
  SE +/- 3.06, N = 3
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Medium
  Frames Per Second, More Is Better: a = 55.89, b = 55.72, c = 55.50
  SE +/- 0.04, N = 3
Timed Linux Kernel Compilation 6.1 - Build: allmodconfig
  Seconds, Fewer Is Better: a = 238.74, b = 240.39, c = 239.63
  SE +/- 0.38, N = 3
OpenVKL 1.3.1 - Benchmark: vklBenchmark Scalar
  Items / Sec, More Is Better: a = 438 (MIN: 53 / MAX: 5407), b = 441 (MIN: 54 / MAX: 5443), c = 441 (MIN: 54 / MAX: 5447)
  SE +/- 0.58, N = 3
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU
  ms, Fewer Is Better: a = 1.54 (MIN: 0.56 / MAX: 38.51), b = 1.53 (MIN: 0.52 / MAX: 40.48), c = 1.53 (MIN: 0.53 / MAX: 26.92)
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Slow
  Frames Per Second, More Is Better: a = 50.56, b = 50.64, c = 50.32
  SE +/- 0.11, N = 3
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Slow
  Frames Per Second, More Is Better: a = 20.06, b = 19.94, c = 20.06
  SE +/- 0.02, N = 3
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
oneDNN 3.0 - Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU
  ms, Fewer Is Better: a = 3.73078 (MIN: 3.51), b = 3.75292 (MIN: 3.52), c = 3.73148 (MIN: 3.51)
  SE +/- 0.00631, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU
  FPS, More Is Better: a = 47089.49, b = 47321.79, c = 47285.92
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 - Model: Person Detection FP32 - Device: CPU
  FPS, More Is Better: a = 13.09, b = 13.15, c = 13.14
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Medium
  Frames Per Second, More Is Better: a = 20.65, b = 20.65, c = 20.57
  SE +/- 0.02, N = 3
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
OpenVINO 2022.3 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU
  ms, Fewer Is Better: a = 9.01 (MIN: 5.51 / MAX: 38.85), b = 9.02 (MIN: 5.09 / MAX: 40.25), c = 8.99 (MIN: 5.64 / MAX: 39.38)
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU
  FPS, More Is Better: a = 51056.58, b = 51048.88, c = 51211.85
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU
  FPS, More Is Better: a = 8849.01, b = 8843.72, c = 8870.43
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 - Model: Person Detection FP32 - Device: CPU
  ms, Fewer Is Better: a = 3007.88 (MIN: 1376.35 / MAX: 3537.67), b = 2999.57 (MIN: 1487.96 / MAX: 3477.36), c = 3005.19 (MIN: 1799.02 / MAX: 3762.45)
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 - Model: Vehicle Detection FP16-INT8 - Device: CPU
  FPS, More Is Better: a = 4354.14, b = 4366.10, c = 4360.70
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Slow
  Frames Per Second, More Is Better: a = 14.82, b = 14.82, c = 14.78
  SE +/- 0.03, N = 3
oneDNN Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.0 Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU a b c 0.2613 0.5226 0.7839 1.0452 1.3065 SE +/- 0.00272, N = 3 1.15870 1.16066 1.16132 MIN: 1 MIN: 0.98 MIN: 1 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenVINO 2022.3 - Model: Vehicle Detection FP16-INT8 - Device: CPU - ms, Fewer Is Better - a: 9.16 (MIN: 6.15 / MAX: 39.84), b: 9.14 (MIN: 5.01 / MAX: 38.39), c: 9.15 (MIN: 5.38 / MAX: 41.31) - 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 - Model: Weld Porosity Detection FP16 - Device: CPU - ms, Fewer Is Better - a: 33.52 (MIN: 13.93 / MAX: 127.81), b: 33.57 (MIN: 15.75 / MAX: 191.18), c: 33.50 (MIN: 16.27 / MAX: 150.01) - 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 - Model: Face Detection FP16 - Device: CPU - ms, Fewer Is Better - a: 1726.75 (MIN: 1186.91 / MAX: 3091.57), b: 1729.69 (MIN: 1532.26 / MAX: 2844.15), c: 1726.31 (MIN: 741.15 / MAX: 2959.51) - 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 - Model: Weld Porosity Detection FP16 - Device: CPU - FPS, More Is Better - a: 2379.62, b: 2376.32, c: 2380.57 - 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 - Model: Face Detection FP16-INT8 - Device: CPU - ms, Fewer Is Better - a: 429.25 (MIN: 206.78 / MAX: 506.89), b: 429.89 (MIN: 231.69 / MAX: 613.56), c: 429.47 (MIN: 191.88 / MAX: 529.07) - 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 - Model: Vehicle Detection FP16 - Device: CPU - ms, Fewer Is Better - a: 38.10 (MIN: 27.95 / MAX: 101.78), b: 38.13 (MIN: 20.38 / MAX: 106.36), c: 38.08 (MIN: 26.81 / MAX: 105.35) - 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 - Model: Face Detection FP16 - Device: CPU - FPS, More Is Better - a: 22.95, b: 22.98, c: 22.95 - 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 - Model: Face Detection FP16-INT8 - Device: CPU - FPS, More Is Better - a: 92.95, b: 92.83, c: 92.93 - 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 - Model: Vehicle Detection FP16 - Device: CPU - FPS, More Is Better - a: 1047.87, b: 1047.10, c: 1048.40 - 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 - Model: Person Vehicle Bike Detection FP16 - Device: CPU - FPS, More Is Better - a: 2157.13, b: 2154.54, c: 2156.73 - 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 - Model: Person Vehicle Bike Detection FP16 - Device: CPU - ms, Fewer Is Better - a: 18.50 (MIN: 13.23 / MAX: 51.65), b: 18.52 (MIN: 12.08 / MAX: 57.28), c: 18.50 (MIN: 10.72 / MAX: 47.28) - 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Very Fast - Frames Per Second, More Is Better - a: 44.16, b: 44.16, c: 44.19 - SE +/- 0.52, N = 3 - 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU - ms, Fewer Is Better - a: 1.41 (MIN: 0.51 / MAX: 40.76), b: 1.41 (MIN: 0.51 / MAX: 37.08), c: 1.41 (MIN: 0.49 / MAX: 37.01) - 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Medium - Frames Per Second, More Is Better - a: 16.71, b: 16.71, c: 16.71 - SE +/- 0.02, N = 3
oneDNN 3.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU - ms, Fewer Is Better - a: 11.55603 (MIN: 9.09), b: 12.16560 (MIN: 11.24), c: 8.11091 (MIN: 7.49) - SE +/- 0.97731, N = 3 - 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU - ms, Fewer Is Better - a: 504.67 (MIN: 472.1), b: 484.34 (MIN: 472.46), c: 463.13 (MIN: 450.07) - SE +/- 18.31, N = 3 - 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU - ms, Fewer Is Better - a: 755.61 (MIN: 685.81), b: 835.13 (MIN: 793.57), c: 767.00 (MIN: 742.01) - SE +/- 36.23, N = 3 - 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU - ms, Fewer Is Better - a: 485.54 (MIN: 439.04), b: 482.72 (MIN: 465.88), c: 499.92 (MIN: 485.42) - SE +/- 19.24, N = 3 - 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU - ms, Fewer Is Better - a: 4.99822 (MIN: 3.64), b: 5.50365 (MIN: 3.65), c: 5.76784 (MIN: 4.21) - SE +/- 0.28155, N = 3 - 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU - ms, Fewer Is Better - a: 0.566395 (MIN: 0.47), b: 0.697435 (MIN: 0.57), c: 0.634303 (MIN: 0.55) - SE +/- 0.020877, N = 3 - 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU - ms, Fewer Is Better - a: 2.80808 (MIN: 1.72), b: 3.13206 (MIN: 2.23), c: 2.13509 (MIN: 1.58) - SE +/- 0.26717, N = 3 - 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU - ms, Fewer Is Better - a: 1.40653 (MIN: 1.06), b: 1.31674 (MIN: 1.14), c: 1.13240 (MIN: 0.98) - SE +/- 0.11422, N = 3 - 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Phoronix Test Suite v10.8.5