AMD EPYC 7513 32-Core testing with a Supermicro H12SSL-i v1.02 (2.4 BIOS) and astdrmfb on AlmaLinux 9.1 via the Phoronix Test Suite.
b
Kernel Notes: Transparent Huge Pages: always
Compiler Notes: --build=x86_64-redhat-linux --disable-libunwind-exceptions --enable-__cxa_atexit --enable-bootstrap --enable-cet --enable-checking=release --enable-gnu-indirect-function --enable-gnu-unique-object --enable-host-bind-now --enable-host-pie --enable-initfini-array --enable-languages=c,c++,fortran,lto --enable-link-serialization=1 --enable-multilib --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-arch_32=x86-64 --with-arch_64=x86-64-v2 --with-build-config=bootstrap-lto --with-gcc-major-version-only --with-linker-hash-style=gnu --with-tune=generic --without-cuda-driver --without-isl
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa001173
Python Notes: Python 3.9.14
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
c d e
Processor: AMD EPYC 7513 32-Core @ 2.60GHz (32 Cores / 64 Threads)
Motherboard: Supermicro H12SSL-i v1.02 (2.4 BIOS)
Memory: 8 x 64 GB DDR4-3200MT/s Samsung M393A8G40AB2-CWE
Disk: 2 x 1920GB SAMSUNG MZQL21T9HCJR-00A07
Graphics: astdrmfb
OS: AlmaLinux 9.1
Kernel: 5.14.0-162.12.1.el9_1.x86_64 (x86_64)
Compiler: GCC 11.3.1 20220421
File-System: ext4
Screen Resolution: 1024x768
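The Kernel Notes and Processor Notes above record two settings that affect benchmark results: Transparent Huge Pages ("always") and the CPU scaling governor ("performance"). The following is a minimal sketch, not part of the Phoronix Test Suite, of how those two values could be read back on a Linux system before a run. The helper names `active_bracketed` and `read_setting` are hypothetical; the sysfs paths are the standard Linux locations, and the sketch assumes they may be absent on some systems.

```python
from pathlib import Path


def active_bracketed(text):
    """Return the token wrapped in [brackets], e.g. 'always' from '[always] madvise never'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_setting(path):
    """Read a sysfs file and strip whitespace; return None if it is missing or unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    # The kernel marks the active Transparent Huge Pages mode with brackets.
    thp = read_setting("/sys/kernel/mm/transparent_hugepage/enabled")
    # cpu0 is used as a representative core; other cores may differ.
    governor = read_setting("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor")
    print("Transparent Huge Pages:", active_bracketed(thp) if thp else "unavailable")
    print("Scaling governor:", governor or "unavailable")
```

Comparing this output against the notes recorded for each run is one way to confirm that two result sets were gathered under the same settings.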
NAMD
NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. Learn more via the OpenBenchmarking.org test page.
NAMD 2.14 - ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better)
e: 0.71234, d: 0.71402, c: 0.71454, b: 0.71456
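NAMD reports days/ns (fewer is better), while the GROMACS test later in this report uses Ns Per Day (more is better). The two units are reciprocals of each other. The sketch below converts the four NAMD values listed above; the function name is hypothetical. The conversion only changes the unit: the NAMD and GROMACS tests simulate different systems, so the converted figures are not directly comparable.

```python
def days_per_ns_to_ns_per_day(days_per_ns):
    """Convert a days/ns figure (lower is better) to ns/day (higher is better)."""
    if days_per_ns <= 0:
        raise ValueError("days/ns must be positive")
    return 1.0 / days_per_ns


# NAMD 2.14 ATPase results from this report, in the order they are listed.
namd_days_per_ns = [0.71234, 0.71402, 0.71454, 0.71456]

for value in namd_days_per_ns:
    print(f"{value} days/ns = {days_per_ns_to_ns_per_day(value):.4f} ns/day")
```

All four runs come out at about 1.40 ns/day.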
dav1d 1.1 (FPS, More Is Better)
Video Input: Summer Nature 4K
e: 323.81, d: 339.17, c: 339.42, b: 339.91
Video Input: Summer Nature 1080p
e: 939.41, d: 933.94, c: 941.16, b: 933.33
Video Input: Chimera 1080p 10-bit
e: 539.70, d: 553.84, c: 556.07, b: 553.58
(CC) gcc options: -pthread -lm
Embree
Intel Embree is a collection of high-performance ray-tracing kernels for execution on CPUs (and GPUs via SYCL) and supporting instruction sets such as SSE, AVX, AVX2, and AVX-512. Embree also supports making use of the Intel SPMD Program Compiler (ISPC).
Embree 4.0 (Frames Per Second, More Is Better)
Binary: Pathtracer - Model: Crown
e: 34.44 (MIN: 34.08 / MAX: 35.11); d: 34.33 (MIN: 33.97 / MAX: 35.05); c: 34.50 (MIN: 34.14 / MAX: 35.22); b: 34.52 (MIN: 34.17 / MAX: 35.11)
Binary: Pathtracer ISPC - Model: Crown
e: 33.28 (MIN: 32.52 / MAX: 34.05); d: 32.91 (MIN: 32.03 / MAX: 33.8); c: 32.96 (MIN: 32.2 / MAX: 33.81); b: 33.23 (MIN: 32.54 / MAX: 33.94)
Binary: Pathtracer - Model: Asian Dragon
e: 40.28 (MIN: 40.08 / MAX: 40.9); d: 40.24 (MIN: 40.05 / MAX: 40.68); c: 40.28 (MIN: 40.07 / MAX: 40.65); b: 40.37 (MIN: 40.18 / MAX: 40.77)
Binary: Pathtracer ISPC - Model: Asian Dragon
e: 38.52 (MIN: 38.34 / MAX: 38.99); d: 38.48 (MIN: 38.27 / MAX: 38.85); c: 38.57 (MIN: 38.34 / MAX: 39.09); b: 38.37 (MIN: 38.16 / MAX: 38.73)
Kvazaar
This is a test of Kvazaar as a CPU-based H.265/HEVC video encoder written in the C programming language and optimized in Assembly. Kvazaar is the winner of the 2016 ACM Open-Source Software Competition and developed at the Ultra Video Group, Tampere University, Finland.
Kvazaar 2.2 (Frames Per Second, More Is Better)
Video Input: Bosphorus 4K - Video Preset: Slow
e: 19.81, d: 19.73, c: 19.76, b: 19.81
Video Input: Bosphorus 4K - Video Preset: Medium
e: 20.34, d: 20.28, c: 20.31, b: 20.36
Video Input: Bosphorus 1080p - Video Preset: Slow
e: 61.53, d: 61.49, c: 61.40, b: 61.76
Video Input: Bosphorus 1080p - Video Preset: Medium
e: 64.23, d: 64.03, c: 63.80, b: 63.80
Video Input: Bosphorus 4K - Video Preset: Very Fast
e: 42.29, d: 42.37, c: 42.26, b: 42.43
Video Input: Bosphorus 4K - Video Preset: Super Fast
e: 52.91, d: 52.82, c: 52.28, b: 52.88
Video Input: Bosphorus 4K - Video Preset: Ultra Fast
e: 62.63, d: 62.28, c: 62.38, b: 61.65
Video Input: Bosphorus 1080p - Video Preset: Very Fast
e: 139.11, d: 139.51, c: 138.96, b: 139.07
Video Input: Bosphorus 1080p - Video Preset: Super Fast
e: 188.47, d: 188.15, c: 187.03, b: 187.41
Video Input: Bosphorus 1080p - Video Preset: Ultra Fast
e: 227.36, d: 227.70, c: 227.91, b: 226.52
(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
SVT-AV1
This is a benchmark of the SVT-AV1 open-source video encoder/decoder. SVT-AV1 was originally developed by Intel as part of their Open Visual Cloud / Scalable Video Technology (SVT). Development of SVT-AV1 has since moved to the Alliance for Open Media as part of upstream AV1 development. SVT-AV1 is a CPU-based multi-threaded video encoder for the AV1 video format; this test runs it against a sample YUV video file.
SVT-AV1 1.4 (Frames Per Second, More Is Better)
Encoder Mode: Preset 4 - Input: Bosphorus 4K
e: 3.624, d: 3.702, c: 3.673, b: 3.707
Encoder Mode: Preset 8 - Input: Bosphorus 4K
e: 65.44, d: 66.18, c: 66.89, b: 66.11
Encoder Mode: Preset 12 - Input: Bosphorus 4K
e: 230.36, d: 231.21, c: 230.23, b: 232.92
Encoder Mode: Preset 13 - Input: Bosphorus 4K
e: 213.28, d: 210.40, c: 209.47, b: 210.05
Encoder Mode: Preset 4 - Input: Bosphorus 1080p
e: 8.878, d: 8.725, c: 8.569, b: 8.559
Encoder Mode: Preset 8 - Input: Bosphorus 1080p
e: 148.00, d: 148.15, c: 146.24, b: 145.56
Encoder Mode: Preset 12 - Input: Bosphorus 1080p
e: 630.68, d: 642.19, c: 635.08, b: 650.84
Encoder Mode: Preset 13 - Input: Bosphorus 1080p
e: 601.65, d: 608.46, c: 609.46, b: 615.56
(CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
uvg266 0.4.1 (Frames Per Second, More Is Better)
Video Input: Bosphorus 1080p - Video Preset: Super Fast
e: 138.94, d: 138.44, c: 139.05, b: 138.53
Video Input: Bosphorus 1080p - Video Preset: Ultra Fast
e: 154.00, d: 154.12, c: 153.27, b: 154.13
VP9 libvpx Encoding 1.13 (Frames Per Second, More Is Better)
Speed: Speed 5 - Input: Bosphorus 4K
e: 16.02, d: 16.19, c: 16.04, b: 15.94
Speed: Speed 0 - Input: Bosphorus 1080p
e: 13.47, d: 13.41, c: 13.50, b: 13.46
Speed: Speed 5 - Input: Bosphorus 1080p
e: 27.50, d: 27.54, c: 27.55, b: 27.48
(CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=gnu++11
VVenC
VVenC is the Fraunhofer Versatile Video Encoder, a fast and efficient H.266/VVC encoder. The vvenc encoder makes use of SIMD Everywhere (SIMDe). The vvenc software is published under The Clear BSD License.
VVenC 1.7 (Frames Per Second, More Is Better)
Video Input: Bosphorus 4K - Video Preset: Fast
e: 4.594, d: 4.595, c: 4.600, b: 4.592
Video Input: Bosphorus 4K - Video Preset: Faster
e: 9.702, d: 9.709, c: 9.723, b: 9.727
Video Input: Bosphorus 1080p - Video Preset: Fast
e: 10.34, d: 10.37, c: 10.36, b: 10.32
Video Input: Bosphorus 1080p - Video Preset: Faster
e: 24.68, d: 24.62, c: 24.60, b: 24.95
(CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto
OpenVKL 1.3.1 - Benchmark: vklBenchmark Scalar (Items / Sec, More Is Better)
e: 221 (MIN: 22 / MAX: 3782); d: 220 (MIN: 22 / MAX: 3718); c: 220 (MIN: 22 / MAX: 3801); b: 221 (MIN: 22 / MAX: 3782)
Build: allmodconfig
b: The test quit with a non-zero exit status.
c: The test quit with a non-zero exit status.
d: The test quit with a non-zero exit status.
e: The test quit with a non-zero exit status.
oneDNN
This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit.
oneDNN 3.0 (ms, Fewer Is Better)
Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU
e: 1.68216 (MIN: 1.47); d: 1.72131 (MIN: 1.52); c: 1.60645 (MIN: 1.41); b: 1.58741 (MIN: 1.39)
Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU
e: 3.56308 (MIN: 3.48); d: 3.57072 (MIN: 3.51); c: 3.57425 (MIN: 3.51); b: 3.58814 (MIN: 3.52)
Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU
e: 1.14607 (MIN: 0.91); d: 1.16502 (MIN: 0.91); c: 1.09886 (MIN: 0.91); b: 1.15028 (MIN: 0.91)
Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU
e: 0.601752 (MIN: 0.5); d: 0.600855 (MIN: 0.5); c: 0.658305 (MIN: 0.52); b: 0.631305 (MIN: 0.51)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU
b: The test run did not produce a result.
c: The test run did not produce a result.
d: The test run did not produce a result.
e: The test run did not produce a result.
Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU
b: The test run did not produce a result.
c: The test run did not produce a result.
d: The test run did not produce a result.
e: The test run did not produce a result.
oneDNN 3.0 (ms, Fewer Is Better)
Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
e: 1.92158 (MIN: 1.77); d: 1.92723 (MIN: 1.77); c: 1.92700 (MIN: 1.79); b: 1.92273 (MIN: 1.78)
Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
e: 4.96930 (MIN: 3.8); d: 4.82696 (MIN: 3.81); c: 4.76535 (MIN: 3.77); b: 4.85692 (MIN: 3.78)
Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU
e: 3.13005 (MIN: 2.96); d: 3.14697 (MIN: 2.96); c: 3.13747 (MIN: 2.95); b: 3.12492 (MIN: 2.95)
Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
e: 2.88171 (MIN: 2.72); d: 2.86461 (MIN: 2.71); c: 2.85112 (MIN: 2.73); b: 2.84457 (MIN: 2.72)
Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
e: 1.25603 (MIN: 1.06); d: 1.26198 (MIN: 1.07); c: 1.26959 (MIN: 1.08); b: 1.28062 (MIN: 1.09)
Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU
e: 1.20678 (MIN: 1.15); d: 1.20774 (MIN: 1.05); c: 1.20994 (MIN: 1.11); b: 1.19509 (MIN: 1.1)
Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
e: 1985.43 (MIN: 1950.2); d: 1979.77 (MIN: 1945.25); c: 1997.66 (MIN: 1960.5); b: 1945.00 (MIN: 1908.31)
Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
e: 1107.68 (MIN: 1061.2); d: 1111.00 (MIN: 1067.05); c: 1096.57 (MIN: 1045.56); b: 1079.44 (MIN: 1036.98)
Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU
e: 1984.57 (MIN: 1946.76); d: 1895.74 (MIN: 1864.49); c: 1972.48 (MIN: 1938.36); b: 1995.62 (MIN: 1960.81)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU
b: The test run did not produce a result.
c: The test run did not produce a result.
d: The test run did not produce a result.
e: The test run did not produce a result.
Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU
b: The test run did not produce a result.
c: The test run did not produce a result.
d: The test run did not produce a result.
e: The test run did not produce a result.
Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU
b: The test run did not produce a result.
c: The test run did not produce a result.
d: The test run did not produce a result.
e: The test run did not produce a result.
oneDNN 3.0 (ms, Fewer Is Better)
Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
e: 1100.81 (MIN: 1049.32); d: 1101.41 (MIN: 1058.06); c: 1104.28 (MIN: 1062.57); b: 1021.40 (MIN: 982.41)
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
e: 1.58816 (MIN: 1.31); d: 1.84734 (MIN: 1.55); c: 5.73527 (MIN: 4.91); b: 6.33545 (MIN: 5.37)
Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
e: 1983.58 (MIN: 1939.97); d: 1824.16 (MIN: 1793.47); c: 1967.38 (MIN: 1927.09); b: 1808.92 (MIN: 1779.23)
Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
e: 1005.94 (MIN: 970.08); d: 1104.18 (MIN: 1054.41); c: 1071.58 (MIN: 1032.07); b: 1082.10 (MIN: 1037.64)
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
e: 9.08448 (MIN: 7.76); d: 9.02263 (MIN: 7.83); c: 8.77153 (MIN: 7.66); b: 8.97089 (MIN: 7.79)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU
b: The test run did not produce a result.
c: The test run did not produce a result.
d: The test run did not produce a result.
e: The test run did not produce a result.
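Most oneDNN harnesses above agree closely across the four runs, but the Matrix Multiply Batch Shapes Transformer f32 timings do not. A quick way to quantify this is the relative spread, (max - min) / min, over the four reported values. The sketch below applies it to that harness and to IP Shapes 3D f32 for contrast. The function name and the choice of this particular spread metric are assumptions for illustration; the report itself does not define a variance measure.

```python
def relative_spread(values):
    """Return (max - min) / min for a list of positive measurements."""
    if not values:
        raise ValueError("need at least one value")
    lowest, highest = min(values), max(values)
    if lowest <= 0:
        raise ValueError("measurements must be positive")
    return (highest - lowest) / lowest


# Values copied from the oneDNN 3.0 results in this report (ms).
matmul_transformer_f32 = [1.58816, 1.84734, 5.73527, 6.33545]
ip_shapes_3d_f32 = [3.56308, 3.57072, 3.57425, 3.58814]

for name, values in [
    ("Matrix Multiply Batch Shapes Transformer f32", matmul_transformer_f32),
    ("IP Shapes 3D f32", ip_shapes_3d_f32),
]:
    print(f"{name}: spread = {relative_spread(values):.1%}")
```

The matrix multiply harness differs roughly fourfold between its fastest and slowest run, while IP Shapes 3D stays within about one percent, so the former is worth re-running before drawing conclusions from it.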
ClickHouse
ClickHouse is an open-source, high performance OLAP data management system. This test profile uses ClickHouse's standard benchmark recommendations per https://clickhouse.com/docs/en/operations/performance-test/ / https://github.com/ClickHouse/ClickBench/tree/main/clickhouse with the 100 million rows web analytics dataset. The reported value is the geometric mean across all separate queries, expressed as queries per minute.
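As a rough illustration of how a geometric-mean aggregate behaves, the sketch below converts per-query times to queries per minute and takes their geometric mean. The per-query timings are hypothetical (the report does not list them), and this is not the test profile's actual scoring code. A geometric mean keeps a few very fast or very slow queries from dominating the aggregate the way an arithmetic mean would.

```python
from statistics import geometric_mean, mean


def queries_per_minute(seconds_per_query):
    """Convert per-query wall times in seconds to queries-per-minute rates."""
    return [60.0 / seconds for seconds in seconds_per_query]


# Hypothetical per-query timings in seconds; not taken from the report.
example_seconds = [0.01, 0.1, 1.0]

rates = queries_per_minute(example_seconds)
print("per-query rates:", rates)
print("geometric mean :", geometric_mean(rates))
print("arithmetic mean:", mean(rates))
```

With these example timings the geometric mean sits at the middle query's rate, while the arithmetic mean is pulled toward the fastest query.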
ClickHouse 22.12.3.5 (Queries Per Minute, Geo Mean, More Is Better)
100M Rows Hits Dataset, First Run / Cold Cache
e: 360.22 (MIN: 21.47 / MAX: 6000); d: 361.64 (MIN: 21.16 / MAX: 5454.55); c: 361.81 (MIN: 21.22 / MAX: 6000); b: 363.35 (MIN: 20.96 / MAX: 6000)
100M Rows Hits Dataset, Second Run
e: 373.56 (MIN: 21.3 / MAX: 6000); d: 380.41 (MIN: 20.99 / MAX: 6000); c: 382.07 (MIN: 21.28 / MAX: 6666.67); b: 377.44 (MIN: 21.36 / MAX: 6000)
100M Rows Hits Dataset, Third Run
e: 376.77 (MIN: 22.03 / MAX: 6666.67); d: 376.60 (MIN: 21.36 / MAX: 5454.55); c: 379.53 (MIN: 21.34 / MAX: 6000); b: 379.66 (MIN: 21.67 / MAX: 6666.67)
GROMACS
This test runs the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package with the water_GMX50 data. The test profile allows selecting between CPU and GPU-based GROMACS builds.
GROMACS 2023 - Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better)
e: 3.546, d: 3.560, c: 3.579, b: 3.551
(CXX) g++ options: -O3
Blender
Blender is an open-source 3D creation and modeling software project. This test is of Blender's Cycles performance with various sample files. GPU computing via NVIDIA OptiX and NVIDIA CUDA is currently supported as well as HIP for AMD Radeon GPUs and Intel oneAPI for Intel Graphics.
Blender 3.4 - Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better)
e: 49.01, d: 49.01, c: 49.07, b: 48.92
OpenVINO
This is a test of Intel OpenVINO, a toolkit around neural networks, using its built-in benchmarking support to measure throughput and latency for various models.
OpenVINO 2022.3 (FPS, More Is Better; ms, Fewer Is Better)
Model: Face Detection FP16 - Device: CPU
FPS: e: 6.63, d: 6.62, c: 6.66, b: 6.61
ms: e: 2392.30 (MIN: 2158.21 / MAX: 2458.54); d: 2406.02 (MIN: 2302.07 / MAX: 2454.61); c: 2370.86 (MIN: 2080.23 / MAX: 2467); b: 2409.92 (MIN: 2320.16 / MAX: 2456.4)
Model: Person Detection FP16 - Device: CPU
FPS: e: 4.45, d: 4.42, c: 4.46, b: 4.43
ms: e: 3561.81 (MIN: 1861.47 / MAX: 3772.2); d: 3579.97 (MIN: 3281.45 / MAX: 3778.06); c: 3537.79 (MIN: 1858.95 / MAX: 3776.45); b: 3550.16 (MIN: 3111.31 / MAX: 3789.27)
Model: Person Detection FP32 - Device: CPU
FPS: e: 4.44, d: 4.44, c: 4.45, b: 4.45
ms: e: 3564.28 (MIN: 3260.23 / MAX: 3777.76); d: 3551.93 (MIN: 3173.2 / MAX: 3757.58); c: 3557.92 (MIN: 3151 / MAX: 3760.66); b: 3547.81 (MIN: 3206.96 / MAX: 3774.87)
Model: Vehicle Detection FP16 - Device: CPU
FPS: e: 602.06, d: 602.60, c: 601.06, b: 604.83
ms: e: 26.55 (MIN: 15.36 / MAX: 56.41); d: 26.53 (MIN: 22.12 / MAX: 47.97); c: 26.60 (MIN: 16.91 / MAX: 50.5); b: 26.43 (MIN: 16.09 / MAX: 48.15)
Model: Face Detection FP16-INT8 - Device: CPU
FPS: e: 15.31, d: 15.29, c: 15.31, b: 15.31
ms: e: 1039.37 (MIN: 966.06 / MAX: 1056.28); d: 1040.49 (MIN: 952.24 / MAX: 1058.21); c: 1035.77 (MIN: 864.49 / MAX: 1056.01); b: 1041.29 (MIN: 992.85 / MAX: 1055.65)
Model: Vehicle Detection FP16-INT8 - Device: CPU
FPS: e: 1069.43, d: 1070.10, c: 1074.81, b: 1071.24
ms: e: 14.95 (MIN: 13.29 / MAX: 21.14); d: 14.94 (MIN: 8.09 / MAX: 36.54); c: 14.87 (MIN: 11.03 / MAX: 26.77); b: 14.92 (MIN: 8.74 / MAX: 27.01)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 (FPS, More Is Better; ms, Fewer Is Better)
Model: Weld Porosity Detection FP16 - Device: CPU
FPS: e: 653.82, d: 652.78, c: 652.95, b: 653.00
ms: e: 24.45 (MIN: 19.54 / MAX: 29.68); d: 24.49 (MIN: 12.97 / MAX: 36.89); c: 24.48 (MIN: 13.99 / MAX: 38.7); b: 24.48 (MIN: 21.75 / MAX: 36.39)
Model: Machine Translation EN To DE FP16 - Device: CPU
FPS: e: 74.05, d: 74.41, c: 76.76, b: 74.47
ms: e: 215.78 (MIN: 185.9 / MAX: 238.93); d: 214.74 (MIN: 107.36 / MAX: 251.72); c: 208.09 (MIN: 130.67 / MAX: 239.9); b: 214.52 (MIN: 184.5 / MAX: 246.93)
Model: Weld Porosity Detection FP16-INT8 - Device: CPU
FPS: e: 1523.80, d: 1523.04, c: 1522.90, b: 1524.17
ms: e: 20.99 (MIN: 12.52 / MAX: 25.05); d: 21.00 (MIN: 10.49 / MAX: 30.81); c: 21.00 (MIN: 10.02 / MAX: 33.74); b: 20.98 (MIN: 10.07 / MAX: 31.37)
Model: Person Vehicle Bike Detection FP16 - Device: CPU
FPS: e: 979.33, d: 980.65, c: 982.44, b: 978.89
ms: e: 16.31 (MIN: 8.98 / MAX: 33.51); d: 16.29 (MIN: 9.55 / MAX: 28.65); c: 16.26 (MIN: 9.92 / MAX: 29.2); b: 16.32 (MIN: 13.76 / MAX: 25.98)
Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU
FPS: e: 21332.05, d: 21354.52, c: 21320.91, b: 21367.23
ms: e: 1.49 (MIN: 0.67 / MAX: 11.81); d: 1.49 (MIN: 0.9 / MAX: 13.38); c: 1.49 (MIN: 0.88 / MAX: 14.27); b: 1.49 (MIN: 0.89 / MAX: 14.55)
Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU
FPS: e: 23183.32, d: 23163.08, c: 23128.89, b: 23189.30
ms: e: 1.37 (MIN: 0.86 / MAX: 5.44); d: 1.37 (MIN: 0.83 / MAX: 13.21); c: 1.37 (MIN: 0.77 / MAX: 14.17); b: 1.37 (MIN: 0.83 / MAX: 13.67)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
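Each OpenVINO model above is reported twice, as throughput (FPS) and as latency (ms). Little's law relates the two: the average number of requests in flight equals throughput multiplied by average latency. The sketch below applies it to two of the models. It assumes the ms figure is a mean per-request latency measured in steady state, which the report does not state, so the output is an estimate rather than a documented setting. The function name is hypothetical.

```python
def implied_concurrency(throughput_fps, latency_ms):
    """Estimate requests in flight via Little's law: L = throughput * latency."""
    if throughput_fps < 0 or latency_ms < 0:
        raise ValueError("throughput and latency must be non-negative")
    return throughput_fps * (latency_ms / 1000.0)


# First-listed FPS and ms values for two models from this report.
samples = {
    "Age Gender Recognition Retail 0013 FP16": (21332.05, 1.49),
    "Face Detection FP16": (6.63, 2392.30),
}

for model, (fps, ms) in samples.items():
    print(f"{model}: about {implied_concurrency(fps, ms):.1f} requests in flight")
```

The estimate comes out near 32 for the small recognition model and near 16 for face detection, which suggests the benchmark picks a different number of parallel inference requests per model. The report does not list that setting, so treat this as an inference from the numbers only.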
RocksDB 7.9.2 (Op/s, More Is Better)
Test: Random Read
e: 149987082, d: 151165462, c: 150808701, b: 150864128
Test: Update Random
e: 707380, d: 712655, c: 707117, b: 710778
Test: Sequential Fill
e: 874204, d: 866920, c: 864582, b: 868898
Test: Random Fill Sync
e: 239475, d: 244964, c: 238213, b: 242617
Test: Read While Writing
e: 5135312, d: 5186718, c: 5120212, b: 5140598
Test: Read Random Write Random
e: 2771257, d: 2768782, c: 2744693, b: 2755830
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
b: Kernel, Compiler, Processor, Python, and Security Notes as listed for b at the top of this report.
Testing initiated at 10 March 2023 18:03 by user .
c Kernel, Compiler, Processor, Python, and Security Notes: identical to run b.
Testing initiated at 10 March 2023 20:56 by user .
d Kernel, Compiler, Processor, Python, and Security Notes: identical to run b.
Testing initiated at 11 March 2023 00:28 by user .
e Processor: AMD EPYC 7513 32-Core @ 2.60GHz (32 Cores / 64 Threads), Motherboard: Supermicro H12SSL-i v1.02 (2.4 BIOS), Memory: 8 x 64 GB DDR4-3200MT/s Samsung M393A8G40AB2-CWE, Disk: 2 x 1920GB SAMSUNG MZQL21T9HCJR-00A07, Graphics: astdrmfb
OS: AlmaLinux 9.1, Kernel: 5.14.0-162.12.1.el9_1.x86_64 (x86_64), Compiler: GCC 11.3.1 20220421, File-System: ext4, Screen Resolution: 1024x768
Kernel, Compiler, Processor, Python, and Security Notes: identical to run b.
Testing initiated at 11 March 2023 05:12 by user .