Intel Xeon Silver 4216 testing with a TYAN S7100AG2NR (V4.02 BIOS) and ASPEED on Debian 12 via the Phoronix Test Suite.
a Kernel Notes: Transparent Huge Pages: alwaysCompiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-bTRWOB/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-bTRWOB/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -vProcessor Notes: Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x500002cPython Notes: Python 3.11.2Security Notes: gather_data_sampling: Vulnerable: No microcode + itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Vulnerable: Clear buffers attempted no microcode; SMT vulnerable + retbleed: Mitigation of Enhanced IBRS + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Mitigation of TSX disabled
b Processor: Intel Xeon Silver 4216 @ 3.20GHz (16 Cores / 32 Threads), Motherboard: TYAN S7100AG2NR (V4.02 BIOS), Chipset: Intel Sky Lake-E DMI3 Registers, Memory: 46GB, Disk: 240GB Corsair Force MP500, Graphics: ASPEED, Audio: Realtek ALC892, Network: 2 x Intel I350
OS: Debian 12, Kernel: 6.1.0-11-amd64 (x86_64), Display Server: X Server, Compiler: GCC 12.2.0, File-System: ext4, Screen Resolution: 1024x768
new okt OpenBenchmarking.org Phoronix Test Suite Intel Xeon Silver 4216 @ 3.20GHz (16 Cores / 32 Threads) TYAN S7100AG2NR (V4.02 BIOS) Intel Sky Lake-E DMI3 Registers 46GB 240GB Corsair Force MP500 ASPEED Realtek ALC892 2 x Intel I350 Debian 12 6.1.0-11-amd64 (x86_64) X Server GCC 12.2.0 ext4 1024x768 Processor Motherboard Chipset Memory Disk Graphics Audio Network OS Kernel Display Server Compiler File-System Screen Resolution New Okt Benchmarks System Logs - Transparent Huge Pages: always - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-bTRWOB/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-bTRWOB/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v - Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x500002c - Python 3.11.2 - gather_data_sampling: Vulnerable: No microcode + itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Vulnerable: Clear buffers attempted no microcode; SMT vulnerable + retbleed: Mitigation of Enhanced IBRS + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Mitigation of TSX disabled
a vs. b Comparison Phoronix Test Suite Baseline +21.9% +21.9% +43.8% +43.8% +65.7% +65.7% 87.4% 13% 11.4% 9.4% 6.8% 5.6% 3.4% 3.1% 2% D.B.s - u8s8f32 - CPU R.N.N.I - bf16bf16bf16 - CPU 16.5% R.N.N.T - u8s8f32 - CPU e.G.B.S - 240 IP Shapes 1D - u8s8f32 - CPU IP Shapes 1D - bf16bf16bf16 - CPU 8.1% D.B.s - bf16bf16bf16 - CPU IP Shapes 3D - u8s8f32 - CPU R.N.N.T - bf16bf16bf16 - CPU 6 3.3% RT.ldr_alb_nrm.3840x2160 - CPU-Only e.G.B.S - 1200 3% Preset 12 - Bosphorus 1080p 2.2% e.G.B.S - 2400 oneDNN oneDNN oneDNN easyWave oneDNN oneDNN oneDNN oneDNN oneDNN libavif avifenc Intel Open Image Denoise easyWave SVT-AV1 easyWave a b
new okt openradioss: Chrysler Neon 1M openvkl: vklBenchmarkCPU ISPC openvkl: vklBenchmarkCPU Scalar openradioss: INIVOL and Fluid Structure Interaction Drop Container easywave: e2Asean Grid + BengkuluSept2007 Source - 2400 openradioss: Bird Strike on Windshield openradioss: Rubber O-Ring Seal Installation oidn: RTLightmap.hdr.4096x4096 - CPU-Only openradioss: Bumper Beam avifenc: 0 easywave: e2Asean Grid + BengkuluSept2007 Source - 1200 openradioss: Cell Phone Drop Test quantlib: Multi-Threaded oidn: RT.hdr_alb_nrm.3840x2160 - CPU-Only oidn: RT.ldr_alb_nrm.3840x2160 - CPU-Only avifenc: 2 onednn: Recurrent Neural Network Training - u8s8f32 - CPU onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU svt-av1: Preset 4 - Bosphorus 4K onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Inference - u8s8f32 - CPU embree: Pathtracer ISPC - Asian Dragon Obj embree: Pathtracer ISPC - Crown openvino: Face Detection FP16 - CPU openvino: Face Detection FP16 - CPU embree: Pathtracer - Asian Dragon Obj openvino: Face Detection FP16-INT8 - CPU openvino: Face Detection FP16-INT8 - CPU openvino: Person Detection FP32 - CPU openvino: Person Detection FP32 - CPU openvino: Person Detection FP16 - CPU openvino: Person Detection FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Face Detection Retail FP16 - CPU openvino: Face Detection Retail FP16 - CPU embree: Pathtracer - Crown embree: Pathtracer ISPC - Asian Dragon embree: Pathtracer - Asian Dragon quantlib: Single-Threaded svt-av1: Preset 4 - Bosphorus 1080p svt-av1: Preset 8 - Bosphorus 4K onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU onednn: IP Shapes 1D - bf16bf16bf16 - CPU onednn: IP Shapes 1D - u8s8f32 - CPU svt-av1: Preset 8 - Bosphorus 1080p avifenc: 6, Lossless onednn: IP Shapes 3D - bf16bf16bf16 - CPU onednn: IP Shapes 3D - u8s8f32 - CPU svt-av1: Preset 13 - Bosphorus 4K easywave: e2Asean Grid + BengkuluSept2007 Source - 240 svt-av1: Preset 12 - Bosphorus 4K avifenc: 6 avifenc: 10, Lossless onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU svt-av1: Preset 12 - Bosphorus 1080p svt-av1: Preset 13 - Bosphorus 1080p onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU a b 871.61 275 95 553.97 404.939 312.5 200.86 0.15 187.21 189.696 157.063 116.59 31381.4 0.32 0.32 90.464 4007.84 3614.7 1.938 1797.93 1824.27 11.7626 9.9432 3455.53 2.31 14.4738 935.89 8.53 326.23 24.46 325.69 24.51 99.82 80.07 46.82 170.73 13.62 586.21 13.2117 13.6903 16.0896 2056.3 5.348 29.894 23.779 2.59994 10.5784 1.28255 44.227 13.266 3.12064 1.54511 73.364 7.976 73.226 7.867 7.667 16.2979 6.2591 181.541 180.466 21.9038 1.90079 864.38 274 94 553.87 396.883 311.49 198.69 0.15 188.77 188.625 161.835 115.9 31348.2 0.32 0.33 91.588 3545.8 3495.73 1.95 2094.22 1807.68 11.7573 9.9097 3462.59 2.3 14.4474 925.88 8.59 325.91 24.52 328.78 24.29 100.56 79.5 46.41 172.22 13.6 587.05 13.1081 13.6191 15.9644 2076.5 5.442 30.156 22.2691 1.38733 11.4363 1.17212 44.411 13.362 3.10326 1.46367 73.222 7.159 73.8 8.129 7.708 16.3061 6.14564 177.555 182.353 21.9112 1.90069 OpenBenchmarking.org
OpenRadioss OpenRadioss is an open-source AGPL-licensed finite element solver for dynamic event analysis OpenRadioss is based on Altair Radioss and open-sourced in 2022. This open-source finite element solver is benchmarked with various example models available from https://www.openradioss.org/models/ and https://github.com/OpenRadioss/ModelExchange/tree/main/Examples. This test is currently using a reference OpenRadioss binary build offered via GitHub. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better OpenRadioss 2023.09.15 Model: Chrysler Neon 1M a b 200 400 600 800 1000 871.61 864.38
OpenRadioss OpenRadioss is an open-source AGPL-licensed finite element solver for dynamic event analysis OpenRadioss is based on Altair Radioss and open-sourced in 2022. This open-source finite element solver is benchmarked with various example models available from https://www.openradioss.org/models/ and https://github.com/OpenRadioss/ModelExchange/tree/main/Examples. This test is currently using a reference OpenRadioss binary build offered via GitHub. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better OpenRadioss 2023.09.15 Model: INIVOL and Fluid Structure Interaction Drop Container a b 120 240 360 480 600 553.97 553.87
easyWave The easyWave software allows simulating tsunami generation and propagation in the context of early warning systems. EasyWave supports making use of OpenMP for CPU multi-threading and there are also GPU ports available but not currently incorporated as part of this test profile. The easyWave tsunami generation software is run with one of the example/reference input files for measuring the CPU execution time. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better easyWave r34 Input: e2Asean Grid + BengkuluSept2007 Source - Time: 2400 a b 90 180 270 360 450 404.94 396.88 1. (CXX) g++ options: -O3 -fopenmp
OpenRadioss OpenRadioss is an open-source AGPL-licensed finite element solver for dynamic event analysis OpenRadioss is based on Altair Radioss and open-sourced in 2022. This open-source finite element solver is benchmarked with various example models available from https://www.openradioss.org/models/ and https://github.com/OpenRadioss/ModelExchange/tree/main/Examples. This test is currently using a reference OpenRadioss binary build offered via GitHub. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better OpenRadioss 2023.09.15 Model: Bird Strike on Windshield a b 70 140 210 280 350 312.50 311.49
OpenRadioss OpenRadioss is an open-source AGPL-licensed finite element solver for dynamic event analysis OpenRadioss is based on Altair Radioss and open-sourced in 2022. This open-source finite element solver is benchmarked with various example models available from https://www.openradioss.org/models/ and https://github.com/OpenRadioss/ModelExchange/tree/main/Examples. This test is currently using a reference OpenRadioss binary build offered via GitHub. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better OpenRadioss 2023.09.15 Model: Bumper Beam a b 40 80 120 160 200 187.21 188.77
easyWave The easyWave software allows simulating tsunami generation and propagation in the context of early warning systems. EasyWave supports making use of OpenMP for CPU multi-threading and there are also GPU ports available but not currently incorporated as part of this test profile. The easyWave tsunami generation software is run with one of the example/reference input files for measuring the CPU execution time. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better easyWave r34 Input: e2Asean Grid + BengkuluSept2007 Source - Time: 1200 a b 40 80 120 160 200 157.06 161.84 1. (CXX) g++ options: -O3 -fopenmp
OpenRadioss OpenRadioss is an open-source AGPL-licensed finite element solver for dynamic event analysis OpenRadioss is based on Altair Radioss and open-sourced in 2022. This open-source finite element solver is benchmarked with various example models available from https://www.openradioss.org/models/ and https://github.com/OpenRadioss/ModelExchange/tree/main/Examples. This test is currently using a reference OpenRadioss binary build offered via GitHub. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better OpenRadioss 2023.09.15 Model: Cell Phone Drop Test a b 30 60 90 120 150 116.59 115.90
QuantLib QuantLib is an open-source library/framework around quantitative finance for modeling, trading and risk management scenarios. QuantLib is written in C++ with Boost and its built-in benchmark used reports the QuantLib Benchmark Index benchmark score. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org MFLOPS, More Is Better QuantLib 1.32 Configuration: Multi-Threaded a b 7K 14K 21K 28K 35K 31381.4 31348.2 1. (CXX) g++ options: -O3 -march=native -fPIE -pie
oneDNN This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU a b 900 1800 2700 3600 4500 4007.84 3545.80 MIN: 3484.39 MIN: 3487.29 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU a b 800 1600 2400 3200 4000 3614.70 3495.73 MIN: 3480.97 MIN: 3479.48 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
SVT-AV1 OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.7 Encoder Mode: Preset 4 - Input: Bosphorus 4K a b 0.4388 0.8776 1.3164 1.7552 2.194 1.938 1.950 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
oneDNN This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU a b 400 800 1200 1600 2000 1797.93 2094.22 MIN: 1794.61 MIN: 1795.75 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU a b 400 800 1200 1600 2000 1824.27 1807.68 MIN: 1793.75 MIN: 1791.14 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
Embree OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.3 Binary: Pathtracer ISPC - Model: Asian Dragon Obj a b 3 6 9 12 15 11.76 11.76 MIN: 11.72 / MAX: 11.85 MIN: 11.7 / MAX: 11.83
OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.3 Binary: Pathtracer ISPC - Model: Crown a b 3 6 9 12 15 9.9432 9.9097 MIN: 9.85 / MAX: 10.05 MIN: 9.85 / MAX: 10
OpenVINO OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.1 Model: Face Detection FP16 - Device: CPU a b 700 1400 2100 2800 3500 3455.53 3462.59 MIN: 3418.23 / MAX: 3513.1 MIN: 3344.65 / MAX: 3744.53 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.1 Model: Face Detection FP16 - Device: CPU a b 0.5198 1.0396 1.5594 2.0792 2.599 2.31 2.30 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Embree OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.3 Binary: Pathtracer - Model: Asian Dragon Obj a b 4 8 12 16 20 14.47 14.45 MIN: 14.28 / MAX: 14.7 MIN: 14.24 / MAX: 14.67
OpenVINO OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.1 Model: Face Detection FP16-INT8 - Device: CPU a b 200 400 600 800 1000 935.89 925.88 MIN: 814.75 / MAX: 1196.97 MIN: 864.58 / MAX: 1160.82 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.1 Model: Face Detection FP16-INT8 - Device: CPU a b 2 4 6 8 10 8.53 8.59 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.1 Model: Person Detection FP32 - Device: CPU a b 70 140 210 280 350 326.23 325.91 MIN: 277.01 / MAX: 404.38 MIN: 157.96 / MAX: 439.44 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.1 Model: Person Detection FP32 - Device: CPU a b 6 12 18 24 30 24.46 24.52 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.1 Model: Person Detection FP16 - Device: CPU a b 70 140 210 280 350 325.69 328.78 MIN: 196.27 / MAX: 414.63 MIN: 274.17 / MAX: 400.09 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.1 Model: Person Detection FP16 - Device: CPU a b 6 12 18 24 30 24.51 24.29 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.1 Model: Road Segmentation ADAS FP16 - Device: CPU a b 20 40 60 80 100 99.82 100.56 MIN: 81.59 / MAX: 163.3 MIN: 47.57 / MAX: 153.01 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.1 Model: Road Segmentation ADAS FP16 - Device: CPU a b 20 40 60 80 100 80.07 79.50 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.1 Model: Vehicle Detection FP16 - Device: CPU a b 11 22 33 44 55 46.82 46.41 MIN: 19.88 / MAX: 104.95 MIN: 22.28 / MAX: 96.86 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.1 Model: Vehicle Detection FP16 - Device: CPU a b 40 80 120 160 200 170.73 172.22 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.1 Model: Face Detection Retail FP16 - Device: CPU a b 4 8 12 16 20 13.62 13.60 MIN: 7.92 / MAX: 98.57 MIN: 7.95 / MAX: 64.22 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.1 Model: Face Detection Retail FP16 - Device: CPU a b 130 260 390 520 650 586.21 587.05 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Embree OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.3 Binary: Pathtracer - Model: Crown a b 3 6 9 12 15 13.21 13.11 MIN: 13.1 / MAX: 13.43 MIN: 12.98 / MAX: 13.28
OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.3 Binary: Pathtracer ISPC - Model: Asian Dragon a b 4 8 12 16 20 13.69 13.62 MIN: 13.65 / MAX: 13.76 MIN: 13.56 / MAX: 13.69
OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.3 Binary: Pathtracer - Model: Asian Dragon a b 4 8 12 16 20 16.09 15.96 MIN: 15.88 / MAX: 16.35 MIN: 15.75 / MAX: 16.2
QuantLib QuantLib is an open-source library/framework around quantitative finance for modeling, trading and risk management scenarios. QuantLib is written in C++ with Boost and its built-in benchmark used reports the QuantLib Benchmark Index benchmark score. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org MFLOPS, More Is Better QuantLib 1.32 Configuration: Single-Threaded a b 400 800 1200 1600 2000 2056.3 2076.5 1. (CXX) g++ options: -O3 -march=native -fPIE -pie
SVT-AV1 OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.7 Encoder Mode: Preset 4 - Input: Bosphorus 1080p a b 1.2245 2.449 3.6735 4.898 6.1225 5.348 5.442 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.7 Encoder Mode: Preset 8 - Input: Bosphorus 4K a b 7 14 21 28 35 29.89 30.16 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
oneDNN This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU a b 6 12 18 24 30 23.78 22.27 MIN: 21.01 MIN: 21.04 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU a b 0.585 1.17 1.755 2.34 2.925 2.59994 1.38733 MIN: 1.31 MIN: 1.31 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU a b 3 6 9 12 15 10.58 11.44 MIN: 9.94 MIN: 9.73 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU a b 0.2886 0.5772 0.8658 1.1544 1.443 1.28255 1.17212 MIN: 1.1 MIN: 1.08 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
SVT-AV1 OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.7 Encoder Mode: Preset 8 - Input: Bosphorus 1080p a b 10 20 30 40 50 44.23 44.41 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
oneDNN This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU a b 0.7021 1.4042 2.1063 2.8084 3.5105 3.12064 3.10326 MIN: 2.99 MIN: 2.97 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU a b 0.3476 0.6952 1.0428 1.3904 1.738 1.54511 1.46367 MIN: 1.43 MIN: 1.42 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
SVT-AV1 OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.7 Encoder Mode: Preset 13 - Input: Bosphorus 4K a b 16 32 48 64 80 73.36 73.22 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
easyWave The easyWave software allows simulating tsunami generation and propagation in the context of early warning systems. EasyWave supports making use of OpenMP for CPU multi-threading and there are also GPU ports available but not currently incorporated as part of this test profile. The easyWave tsunami generation software is run with one of the example/reference input files for measuring the CPU execution time. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better easyWave r34 Input: e2Asean Grid + BengkuluSept2007 Source - Time: 240 a b 2 4 6 8 10 7.976 7.159 1. (CXX) g++ options: -O3 -fopenmp
SVT-AV1 OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.7 Encoder Mode: Preset 12 - Input: Bosphorus 4K a b 16 32 48 64 80 73.23 73.80 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
oneDNN This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU a b 4 8 12 16 20 16.30 16.31 MIN: 16.28 MIN: 16.28 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU a b 2 4 6 8 10 6.25910 6.14564 MIN: 6.21 MIN: 6.1 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
SVT-AV1 OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.7 Encoder Mode: Preset 12 - Input: Bosphorus 1080p a b 40 80 120 160 200 181.54 177.56 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.7 Encoder Mode: Preset 13 - Input: Bosphorus 1080p a b 40 80 120 160 200 180.47 182.35 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
oneDNN This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU a b 5 10 15 20 25 21.90 21.91 MIN: 21.78 MIN: 21.8 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU a b 0.4277 0.8554 1.2831 1.7108 2.1385 1.90079 1.90069 MIN: 1.89 MIN: 1.89 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
a Kernel Notes: Transparent Huge Pages: alwaysCompiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-bTRWOB/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-bTRWOB/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -vProcessor Notes: Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x500002cPython Notes: Python 3.11.2Security Notes: gather_data_sampling: Vulnerable: No microcode + itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Vulnerable: Clear buffers attempted no microcode; SMT vulnerable + retbleed: Mitigation of Enhanced IBRS + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Mitigation of TSX disabled
Testing initiated at 24 October 2023 19:09 by user phoronix.
b Processor: Intel Xeon Silver 4216 @ 3.20GHz (16 Cores / 32 Threads), Motherboard: TYAN S7100AG2NR (V4.02 BIOS), Chipset: Intel Sky Lake-E DMI3 Registers, Memory: 46GB, Disk: 240GB Corsair Force MP500, Graphics: ASPEED, Audio: Realtek ALC892, Network: 2 x Intel I350
OS: Debian 12, Kernel: 6.1.0-11-amd64 (x86_64), Display Server: X Server, Compiler: GCC 12.2.0, File-System: ext4, Screen Resolution: 1024x768
Kernel Notes: Transparent Huge Pages: alwaysCompiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-bTRWOB/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-bTRWOB/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -vProcessor Notes: Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x500002cPython Notes: Python 3.11.2Security Notes: gather_data_sampling: Vulnerable: No microcode + itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Vulnerable: Clear buffers attempted no microcode; SMT vulnerable + retbleed: Mitigation of Enhanced IBRS + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Mitigation of TSX disabled
Testing initiated at 24 October 2023 21:08 by user phoronix.