2 x Intel Xeon Platinum 8380 testing with an Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS) and ASPEED on Ubuntu 22.04 via the Phoronix Test Suite.
a
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: intel_pstate performance (EPP: performance) - CPU Microcode: 0xd000375
Python Notes: Python 3.10.6
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
b c Processor: 2 x Intel Xeon Platinum 8380 @ 3.40GHz (80 Cores / 160 Threads), Motherboard: Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS), Chipset: Intel Device 0998, Memory: 512GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Monitor: VE228, Network: 2 x Intel X710 for 10GBASE-T + 2 x Intel E810-C for QSFP
OS: Ubuntu 22.04, Kernel: 5.15.0-47-generic (x86_64), Desktop: GNOME Shell 42.4, Display Server: X Server 1.21.1.3, Vulkan: 1.2.204, Compiler: GCC 11.2.0, File-System: ext4, Screen Resolution: 1920x1080
BRL-CAD BRL-CAD is a cross-platform, open-source solid modeling system with a built-in benchmark mode. Learn more via the OpenBenchmarking.org test page.
BRL-CAD 7.34 - VGR Performance Metric (VGR Performance Metric, More Is Better)
a: 2460090, b: 2441642, c: 2462204
(CXX) g++ options: -std=c++14 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lregex_brl -lz_brl -lnetpbm -ldl -lm -ltk8.6
OpenVKL 1.3.1 - Benchmark: vklBenchmark ISPC (Items / Sec, More Is Better)
a: 921 (MIN: 140 / MAX: 7539), b: 915 (MIN: 141 / MAX: 7348), c: 922 (MIN: 141 / MAX: 7376); SE +/- 3.06, N = 3
CockroachDB CockroachDB is a cloud-native, distributed SQL database for data-intensive applications. This test profile uses a server-less CockroachDB configuration to test various Cockroach workloads on the local host with a single node. Learn more via the OpenBenchmarking.org test page.
CockroachDB 22.2 - Workload: KV, 10% Reads - Concurrency: 256 (ops/s, More Is Better)
a: 85831.0, b: 83364.9, c: 87542.9; SE +/- 365.94, N = 3
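The "SE +/- 365.94, N = 3" annotation can be related to run-to-run spread. The following is a minimal sketch, assuming that SE denotes the standard error of the mean over N trials (the report does not define it), so that the sample standard deviation is SE * sqrt(N). The function name `stddev_from_se` is hypothetical and not part of the Phoronix Test Suite.

```python
import math


def stddev_from_se(se: float, n: int) -> float:
    """Recover a sample standard deviation from a standard error of the mean.

    Assumes SE = s / sqrt(N). This is an assumption about how the report
    computes its "SE +/-" figure, not something the report states.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return se * math.sqrt(n)


# CockroachDB KV, 10% Reads, Concurrency 256: SE +/- 365.94, N = 3
implied = stddev_from_se(365.94, 3)
print(f"implied standard deviation: {implied:.2f} ops/s")
```

Under that assumption, a difference between two runs that is small compared with this spread may not be meaningful on its own.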
oneDNN This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, using its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.
oneDNN 3.0 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
a: 755.37 (MIN: 720.26), b: 812.45 (MIN: 775.85), c: 736.40 (MIN: 711.54); SE +/- 5.99, N = 3
oneDNN 3.0 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
a: 755.61 (MIN: 685.81), b: 835.13 (MIN: 793.57), c: 767.00 (MIN: 742.01); SE +/- 36.23, N = 3
oneDNN 3.0 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
a: 736.73 (MIN: 693.29), b: 713.11 (MIN: 689.07), c: 716.12 (MIN: 689.45); SE +/- 13.60, N = 3
oneDNN 3.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
a: 504.67 (MIN: 472.1), b: 484.34 (MIN: 472.46), c: 463.13 (MIN: 450.07); SE +/- 18.31, N = 3
oneDNN 3.0 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
a: 485.54 (MIN: 439.04), b: 482.72 (MIN: 465.88), c: 499.92 (MIN: 485.42); SE +/- 19.24, N = 3
oneDNN 3.0 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
a: 516.70 (MIN: 470.69), b: 496.95 (MIN: 480.38), c: 474.18 (MIN: 461.99); SE +/- 15.41, N = 3
(CXX) g++ options for the six oneDNN results above: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
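To compare runs on a "Fewer Is Better" metric, one option is the signed percentage change relative to a baseline run. The sketch below uses the Recurrent Neural Network Training f32 figures, assuming the three values are listed in the order a, b, c and taking run a as the baseline; the helper name `percent_change` is hypothetical.

```python
def percent_change(baseline: float, value: float) -> float:
    """Signed percentage change of value relative to baseline."""
    return (value - baseline) / baseline * 100.0


# oneDNN 3.0, Recurrent Neural Network Training, f32 (ms, fewer is better).
# Assumes the reported values appear in the order a, b, c.
results = {"a": 755.37, "b": 812.45, "c": 736.40}

for run in ("b", "c"):
    delta = percent_change(results["a"], results[run])
    # For a "fewer is better" metric, a positive change means the run was slower.
    verdict = "slower" if delta > 0 else "faster"
    print(f"{run} vs a: {delta:+.2f}% ({verdict})")
```

For a "More Is Better" metric the interpretation of the sign is reversed.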
Numenta Anomaly Benchmark Numenta Anomaly Benchmark (NAB) is a benchmark for evaluating algorithms for anomaly detection in streaming, real-time applications. It comprises over 50 labeled real-world and artificial time-series data files plus a novel scoring mechanism designed for real-time applications. This test profile currently measures the time to run various detectors. Learn more via the OpenBenchmarking.org test page.
Numenta Anomaly Benchmark 1.1 - Detector: KNN CAD (Seconds, Fewer Is Better)
a: 117.92, b: 111.16, c: 116.76
Numenta Anomaly Benchmark 1.1 - Detector: Earthgecko Skyline (Seconds, Fewer Is Better)
a: 81.03, b: 83.18, c: 81.08
OpenVINO This is a test of the Intel OpenVINO, a toolkit around neural networks, using its built-in benchmarking support and analyzing the throughput and latency for various models. Learn more via the OpenBenchmarking.org test page.
OpenVINO 2022.3 - Model: Person Detection FP32 - Device: CPU
ms, Fewer Is Better: a: 3007.88 (MIN: 1376.35 / MAX: 3537.67), b: 2999.57 (MIN: 1487.96 / MAX: 3477.36), c: 3005.19 (MIN: 1799.02 / MAX: 3762.45)
FPS, More Is Better: a: 13.09, b: 13.15, c: 13.14
OpenVINO 2022.3 - Model: Person Detection FP16 - Device: CPU
ms, Fewer Is Better: a: 2942.97 (MIN: 1578.88 / MAX: 3433.68), b: 2944.02 (MIN: 1597.94 / MAX: 3616.23), c: 2977.64 (MIN: 2273.94 / MAX: 3469.82)
FPS, More Is Better: a: 13.42, b: 13.40, c: 13.21
OpenVINO 2022.3 - Model: Face Detection FP16 - Device: CPU
ms, Fewer Is Better: a: 1726.75 (MIN: 1186.91 / MAX: 3091.57), b: 1729.69 (MIN: 1532.26 / MAX: 2844.15), c: 1726.31 (MIN: 741.15 / MAX: 2959.51)
FPS, More Is Better: a: 22.95, b: 22.98, c: 22.95
(CXX) g++ options for the OpenVINO results above: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 - Model: Face Detection FP16-INT8 - Device: CPU
ms, Fewer Is Better: a: 429.25 (MIN: 206.78 / MAX: 506.89), b: 429.89 (MIN: 231.69 / MAX: 613.56), c: 429.47 (MIN: 191.88 / MAX: 529.07)
FPS, More Is Better: a: 92.95, b: 92.83, c: 92.93
OpenVINO 2022.3 - Model: Person Vehicle Bike Detection FP16 - Device: CPU
ms, Fewer Is Better: a: 18.50 (MIN: 13.23 / MAX: 51.65), b: 18.52 (MIN: 12.08 / MAX: 57.28), c: 18.50 (MIN: 10.72 / MAX: 47.28)
FPS, More Is Better: a: 2157.13, b: 2154.54, c: 2156.73
OpenVINO 2022.3 - Model: Machine Translation EN To DE FP16 - Device: CPU
ms, Fewer Is Better: a: 153.57 (MIN: 64.36 / MAX: 1093.49), b: 148.47 (MIN: 80.24 / MAX: 1097.3), c: 147.66 (MIN: 132.11 / MAX: 1013.19)
FPS, More Is Better: a: 259.74, b: 268.94, c: 270.50
OpenVINO 2022.3 - Model: Vehicle Detection FP16 - Device: CPU
ms, Fewer Is Better: a: 38.10 (MIN: 27.95 / MAX: 101.78), b: 38.13 (MIN: 20.38 / MAX: 106.36), c: 38.08 (MIN: 26.81 / MAX: 105.35)
FPS, More Is Better: a: 1047.87, b: 1047.10, c: 1048.40
OpenVINO 2022.3 - Model: Vehicle Detection FP16-INT8 - Device: CPU
ms, Fewer Is Better: a: 9.16 (MIN: 6.15 / MAX: 39.84), b: 9.14 (MIN: 5.01 / MAX: 38.39), c: 9.15 (MIN: 5.38 / MAX: 41.31)
FPS, More Is Better: a: 4354.14, b: 4366.10, c: 4360.70
OpenVINO 2022.3 - Model: Weld Porosity Detection FP16 - Device: CPU
ms, Fewer Is Better: a: 33.52 (MIN: 13.93 / MAX: 127.81), b: 33.57 (MIN: 15.75 / MAX: 191.18), c: 33.50 (MIN: 16.27 / MAX: 150.01)
FPS, More Is Better: a: 2379.62, b: 2376.32, c: 2380.57
OpenVINO 2022.3 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU
ms, Fewer Is Better: a: 9.01 (MIN: 5.51 / MAX: 38.85), b: 9.02 (MIN: 5.09 / MAX: 40.25), c: 8.99 (MIN: 5.64 / MAX: 39.38)
FPS, More Is Better: a: 8849.01, b: 8843.72, c: 8870.43
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU
ms, Fewer Is Better: a: 1.41 (MIN: 0.51 / MAX: 40.76), b: 1.41 (MIN: 0.51 / MAX: 37.08), c: 1.41 (MIN: 0.49 / MAX: 37.01)
FPS, More Is Better: a: 51056.58, b: 51048.88, c: 51211.85
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU
ms, Fewer Is Better: a: 1.54 (MIN: 0.56 / MAX: 38.51), b: 1.53 (MIN: 0.52 / MAX: 40.48), c: 1.53 (MIN: 0.53 / MAX: 26.92)
FPS, More Is Better: a: 47089.49, b: 47321.79, c: 47285.92
(CXX) g++ options for the OpenVINO results above: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
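Each OpenVINO model is reported twice, once as latency in ms and once as throughput in FPS. The two can be related through Little's law (average items in flight = throughput x time in system). The sketch below is an illustration only: it assumes the reported ms figure is the average per-request latency and that the benchmark ran at steady state, neither of which the report states. The function name `implied_in_flight` is hypothetical.

```python
def implied_in_flight(latency_ms: float, throughput_fps: float) -> float:
    """Average number of requests in flight by Little's law (L = lambda * W)."""
    return throughput_fps * (latency_ms / 1000.0)


# (latency in ms, throughput in FPS) for run a, copied from the OpenVINO results.
run_a = {
    "Face Detection FP16": (1726.75, 22.95),
    "Vehicle Detection FP16-INT8": (9.16, 4354.14),
    "Age Gender Recognition Retail 0013 FP16-INT8": (1.41, 51056.58),
}

for model, (latency_ms, fps) in run_a.items():
    in_flight = implied_in_flight(latency_ms, fps)
    print(f"{model}: about {in_flight:.1f} requests in flight")
```

If the assumptions hold, the product hints at how many inference requests the benchmark kept in parallel for each model, which helps explain why a model with multi-second latency can still sustain double-digit FPS.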
Kvazaar This is a test of Kvazaar as a CPU-based H.265/HEVC video encoder written in the C programming language and optimized in Assembly. Kvazaar is the winner of the 2016 ACM Open-Source Software Competition and developed at the Ultra Video Group, Tampere University, Finland. Learn more via the OpenBenchmarking.org test page.
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Slow (Frames Per Second, More Is Better)
a: 20.06, b: 19.94, c: 20.06; SE +/- 0.02, N = 3
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better)
a: 20.65, b: 20.65, c: 20.57; SE +/- 0.02, N = 3
(CC) gcc options for the Kvazaar results above: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Numenta Anomaly Benchmark 1.1 - Detector: Contextual Anomaly Detector OSE (Seconds, Fewer Is Better)
a: 42.18, b: 42.69, c: 43.51
oneDNN 3.0 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
a: 6.94506 (MIN: 6.34), b: 6.94982 (MIN: 6.35), c: 7.07135 (MIN: 6.32); SE +/- 0.04586, N = 3
oneDNN 3.0 - Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
a: 3.73078 (MIN: 3.51), b: 3.75292 (MIN: 3.52), c: 3.73148 (MIN: 3.51); SE +/- 0.00631, N = 3
oneDNN 3.0 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
a: 0.370879 (MIN: 0.33), b: 0.369915 (MIN: 0.33), c: 0.374184 (MIN: 0.34); SE +/- 0.002095, N = 3
oneDNN 3.0 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
a: 1.40653 (MIN: 1.06), b: 1.31674 (MIN: 1.14), c: 1.13240 (MIN: 0.98); SE +/- 0.11422, N = 3
oneDNN 3.0 - Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
a: 4.99822 (MIN: 3.64), b: 5.50365 (MIN: 3.65), c: 5.76784 (MIN: 4.21); SE +/- 0.28155, N = 3
oneDNN 3.0 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
a: 2.80808 (MIN: 1.72), b: 3.13206 (MIN: 2.23), c: 2.13509 (MIN: 1.58); SE +/- 0.26717, N = 3
(CXX) g++ options for the six oneDNN results above: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Numenta Anomaly Benchmark 1.1 - Detector: Bayesian Changepoint (Seconds, Fewer Is Better)
a: 24.13, b: 23.83, c: 23.82
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Super Fast (Frames Per Second, More Is Better)
a: 43.09, b: 43.67, c: 43.63; SE +/- 0.70, N = 3
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better)
a: 44.16, b: 44.16, c: 44.19; SE +/- 0.52, N = 3
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Super Fast (Frames Per Second, More Is Better)
a: 46.81, b: 47.30, c: 47.58; SE +/- 0.86, N = 3
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, More Is Better)
a: 48.64, b: 47.98, c: 49.69; SE +/- 0.33, N = 3
(CC) gcc options for the Kvazaar results above: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
oneDNN 3.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
a: 11.55603 (MIN: 9.09), b: 12.16560 (MIN: 11.24), c: 8.11091 (MIN: 7.49); SE +/- 0.97731, N = 3
oneDNN 3.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
a: 0.232192 (MIN: 0.21), b: 0.235762 (MIN: 0.22), c: 0.238500 (MIN: 0.22); SE +/- 0.001872, N = 3
oneDNN 3.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
a: 0.170641 (MIN: 0.15), b: 0.167326 (MIN: 0.15), c: 0.170187 (MIN: 0.15); SE +/- 0.002111, N = 3
(CXX) g++ options for the three oneDNN results above: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Medium (Frames Per Second, More Is Better)
a: 55.89, b: 55.72, c: 55.50; SE +/- 0.04, N = 3
oneDNN 3.0 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
a: 2.02641 (MIN: 1.82), b: 2.05187 (MIN: 1.87), c: 2.18165 (MIN: 1.94); SE +/- 0.02958, N = 3
oneDNN 3.0 - Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
a: 2.78657 (MIN: 2.04), b: 2.90967 (MIN: 2.19), c: 2.79006 (MIN: 2.05); SE +/- 0.06218, N = 3
oneDNN 3.0 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
a: 0.566395 (MIN: 0.47), b: 0.697435 (MIN: 0.57), c: 0.634303 (MIN: 0.55); SE +/- 0.020877, N = 3
(CXX) g++ options for the three oneDNN results above: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Numenta Anomaly Benchmark 1.1 - Detector: Relative Entropy (Seconds, Fewer Is Better)
a: 12.95, b: 13.11, c: 12.98
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Slow (Frames Per Second, More Is Better)
a: 83.36, b: 81.99, c: 83.49; SE +/- 0.58, N = 3
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Medium (Frames Per Second, More Is Better)
a: 85.78, b: 85.35, c: 84.56; SE +/- 0.19, N = 3
(CC) gcc options for the Kvazaar results above: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
oneDNN 3.0 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
a: 1.43378 (MIN: 1.28), b: 1.40769 (MIN: 1.23), c: 1.40601 (MIN: 1.29); SE +/- 0.00817, N = 3
oneDNN 3.0 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
a: 1.15870 (MIN: 1), b: 1.16066 (MIN: 0.98), c: 1.16132 (MIN: 1); SE +/- 0.00272, N = 3
oneDNN 3.0 - Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
a: 2.08794 (MIN: 2.03), b: 2.09255 (MIN: 2.03), c: 2.10535 (MIN: 2.03); SE +/- 0.00117, N = 3
(CXX) g++ options for the three oneDNN results above: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Super Fast (Frames Per Second, More Is Better)
a: 148.14, b: 146.01, c: 148.56; SE +/- 1.19, N = 3
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Ultra Fast (Frames Per Second, More Is Better)
a: 151.35, b: 152.75, c: 149.16; SE +/- 1.05, N = 3
Numenta Anomaly Benchmark 1.1 - Detector: Windowed Gaussian (Seconds, Fewer Is Better)
a: 6.417, b: 6.264, c: 6.197
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Very Fast (Frames Per Second, More Is Better)
a: 177.06, b: 179.03, c: 180.91; SE +/- 1.82, N = 3
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Super Fast (Frames Per Second, More Is Better)
a: 183.22, b: 176.22, c: 195.56; SE +/- 3.56, N = 3
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Ultra Fast (Frames Per Second, More Is Better)
a: 189.93, b: 183.20, c: 181.68; SE +/- 2.47, N = 3
(CC) gcc options for the Kvazaar results above: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
a: 3.59352 (MIN: 3.52), b: 3.61988 (MIN: 3.53), c: 3.58534 (MIN: 3.51); SE +/- 0.00722, N = 3
oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
a: 0.875625 (MIN: 0.83), b: 0.866728 (MIN: 0.83), c: 0.870342 (MIN: 0.83); SE +/- 0.002166, N = 3
oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
a: 0.201735 (MIN: 0.19), b: 0.197708 (MIN: 0.19), c: 0.196388 (MIN: 0.18); SE +/- 0.000867, N = 3
(CXX) g++ options for the three oneDNN results above: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
a: Kernel, compiler, processor, Python, and security notes are identical to the notes listed for a at the top of this report.
Testing initiated at 5 January 2023 16:13 by user phoronix.
b: Kernel, compiler, processor, Python, and security notes are the same as for a. Testing initiated at 5 January 2023 20:07 by user phoronix.
c Processor: 2 x Intel Xeon Platinum 8380 @ 3.40GHz (80 Cores / 160 Threads), Motherboard: Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS), Chipset: Intel Device 0998, Memory: 512GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Monitor: VE228, Network: 2 x Intel X710 for 10GBASE-T + 2 x Intel E810-C for QSFP
OS: Ubuntu 22.04, Kernel: 5.15.0-47-generic (x86_64), Desktop: GNOME Shell 42.4, Display Server: X Server 1.21.1.3, Vulkan: 1.2.204, Compiler: GCC 11.2.0, File-System: ext4, Screen Resolution: 1920x1080
Kernel, compiler, processor, Python, and security notes are the same as for a.
Testing initiated at 6 January 2023 04:25 by user phoronix.