AMD Ryzen 9 7900 12-Core testing with a ASRockRack 1U4LW-B650/2L2T B650D4U-2L2T/BCM (2.09 BIOS) and ASPEED on Ubuntu 24.04 via the Phoronix Test Suite.
2 x 32GB DDR5-4800 Processor: AMD Ryzen 9 7900 12-Core @ 5.48GHz (12 Cores / 24 Threads), Motherboard: ASRockRack 1U4LW-B650/2L2T B650D4U-2L2T/BCM (2.09 BIOS), Chipset: AMD Device 14d8, Memory: 2 x 32GB DDR5-4800MT/s Micron MTC20C2085S1EC48BA1, Disk: 1024GB SOLIDIGM SSDPFKNU010TZ, Graphics: ASPEED, Audio: AMD Rembrandt Radeon HD Audio, Monitor: VA2431, Network: 2 x Intel I210 + 2 x Broadcom BCM57416 NetXtreme-E Dual-Media 10G RDMA
OS: Ubuntu 24.04, Kernel: 6.8.0-31-generic (x86_64), Desktop: GNOME Shell 46.0, Display Server: X Server, Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1920x1200
Kernel Notes: Transparent Huge Pages: madviseCompiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -vProcessor Notes: Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - CPU Microcode: 0xa601203Python Notes: Python 3.12.3Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Vulnerable: Safe RET no microcode + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
ASTC Encoder ASTC Encoder (astcenc) is for the Adaptive Scalable Texture Compression (ASTC) format commonly used with OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile does a coding test of both compression/decompression. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 4.7 Preset: Fast 2 x 32GB DDR5-4800 60 120 180 240 300 267.75 1. (CXX) g++ options: -O3 -flto -pthread
Blender Blender is an open-source 3D creation and modeling software project. This test is of Blender's Cycles performance with various sample files. GPU computing via NVIDIA OptiX and NVIDIA CUDA is currently supported as well as HIP for AMD Radeon GPUs and Intel oneAPI for Intel Graphics. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.1 Blend File: BMW27 - Compute: CPU-Only 2 x 32GB DDR5-4800 20 40 60 80 100 75.43
BRL-CAD BRL-CAD is a cross-platform, open-source solid modeling system with built-in benchmark mode. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org VGR Performance Metric, More Is Better BRL-CAD 7.38.2 VGR Performance Metric 2 x 32GB DDR5-4800 60K 120K 180K 240K 300K 280941 1. (CXX) g++ options: -std=c++17 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lnetpbm -lregex_brl -lz_brl -lassimp -ldl -lm -ltk8.6
Embree Intel Embree is a collection of high-performance ray-tracing kernels for execution on CPUs (and GPUs via SYCL) and supporting instruction sets such as SSE, AVX, AVX2, and AVX-512. Embree also supports making use of the Intel SPMD Program Compiler (ISPC). Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.3 Binary: Pathtracer ISPC - Model: Crown 2 x 32GB DDR5-4800 5 10 15 20 25 22.61 MIN: 22.4 / MAX: 22.99
OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.3 Binary: Pathtracer ISPC - Model: Asian Dragon 2 x 32GB DDR5-4800 6 12 18 24 30 25.27 MIN: 25.12 / MAX: 25.79
OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.3 Binary: Pathtracer ISPC - Model: Asian Dragon Obj 2 x 32GB DDR5-4800 5 10 15 20 25 21.58 MIN: 21.45 / MAX: 21.94
FFmpeg This is a benchmark of the FFmpeg multimedia framework. The FFmpeg test profile is making use of a modified version of vbench from Columbia University's Architecture and Design Lab (ARCADE) [http://arcade.cs.columbia.edu/vbench/] that is a benchmark for video-as-a-service workloads. The test profile offers the options of a range of vbench scenarios based on freely distributable video content and offers the options of using the x264 or x265 video encoders for transcoding. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org FPS, More Is Better FFmpeg 7.0 Encoder: libx265 - Scenario: Live 2 x 32GB DDR5-4800 40 80 120 160 200 175.30 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
OpenBenchmarking.org FPS, More Is Better FFmpeg 7.0 Encoder: libx265 - Scenario: Upload 2 x 32GB DDR5-4800 8 16 24 32 40 33.66 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
OpenBenchmarking.org FPS, More Is Better FFmpeg 7.0 Encoder: libx265 - Scenario: Platform 2 x 32GB DDR5-4800 15 30 45 60 75 68.30 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
OpenBenchmarking.org FPS, More Is Better FFmpeg 7.0 Encoder: libx265 - Scenario: Video On Demand 2 x 32GB DDR5-4800 15 30 45 60 75 68.27 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample high resolution (currently 15400 x 6940) JPEG image. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.43 Operation: Swirl 2 x 32GB DDR5-4800 40 80 120 160 200 187 1. (CC) gcc options: -fopenmp -O2 -ltiff -ljbig -lsharpyuv -lwebp -lwebpmux -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -lxml2 -lzstd -llzma -lbz2 -lz -lm -lpthread -lgomp
OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.43 Operation: Rotate 2 x 32GB DDR5-4800 40 80 120 160 200 165 1. (CC) gcc options: -fopenmp -O2 -ltiff -ljbig -lsharpyuv -lwebp -lwebpmux -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -lxml2 -lzstd -llzma -lbz2 -lz -lm -lpthread -lgomp
OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.43 Operation: Sharpen 2 x 32GB DDR5-4800 10 20 30 40 50 43 1. (CC) gcc options: -fopenmp -O2 -ltiff -ljbig -lsharpyuv -lwebp -lwebpmux -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -lxml2 -lzstd -llzma -lbz2 -lz -lm -lpthread -lgomp
OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.43 Operation: Enhanced 2 x 32GB DDR5-4800 16 32 48 64 80 70 1. (CC) gcc options: -fopenmp -O2 -ltiff -ljbig -lsharpyuv -lwebp -lwebpmux -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -lxml2 -lzstd -llzma -lbz2 -lz -lm -lpthread -lgomp
OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.43 Operation: Resizing 2 x 32GB DDR5-4800 60 120 180 240 300 289 1. (CC) gcc options: -fopenmp -O2 -ltiff -ljbig -lsharpyuv -lwebp -lwebpmux -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -lxml2 -lzstd -llzma -lbz2 -lz -lm -lpthread -lgomp
OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.43 Operation: Noise-Gaussian 2 x 32GB DDR5-4800 20 40 60 80 100 91 1. (CC) gcc options: -fopenmp -O2 -ltiff -ljbig -lsharpyuv -lwebp -lwebpmux -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -lxml2 -lzstd -llzma -lbz2 -lz -lm -lpthread -lgomp
OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.43 Operation: HWB Color Space 2 x 32GB DDR5-4800 60 120 180 240 300 259 1. (CC) gcc options: -fopenmp -O2 -ltiff -ljbig -lsharpyuv -lwebp -lwebpmux -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -lxml2 -lzstd -llzma -lbz2 -lz -lm -lpthread -lgomp
GROMACS The GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package testing with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Ns Per Day, More Is Better GROMACS 2024 Implementation: MPI CPU - Input: water_GMX50_bare 2 x 32GB DDR5-4800 0.484 0.968 1.452 1.936 2.42 2.151 1. (CXX) g++ options: -O3 -lm
JPEG-XL Decoding libjxl The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities with JPEG XL offering better image quality and compression over legacy JPEG. This test profile is suited for JPEG XL decode performance testing to PNG output file, the pts/jpexl test is for encode performance. The JPEG XL encoding/decoding is done using the libjxl codebase. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org MP/s, More Is Better JPEG-XL Decoding libjxl 0.10.1 CPU Threads: 1 2 x 32GB DDR5-4800 20 40 60 80 100 85.28
JPEG-XL libjxl The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities with JPEG XL offering better image quality and compression over legacy JPEG. This test profile is currently focused on the multi-threaded JPEG XL image encode performance using the reference libjxl library. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org MP/s, More Is Better JPEG-XL libjxl 0.10.1 Input: PNG - Quality: 80 2 x 32GB DDR5-4800 10 20 30 40 50 44.90 1. (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
libxsmm Libxsmm is an open-source library for specialized dense and sparse matrix operations and deep learning primitives. Libxsmm supports making use of Intel AMX, AVX-512, and other modern CPU instruction set capabilities. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org GFLOPS/s, More Is Better libxsmm 2-1.17-3645 M N K: 32 2 x 32GB DDR5-4800 20 40 60 80 100 107.1 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -pedantic -O2 -fopenmp -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -march=core-avx2
Llama.cpp Llama.cpp is a port of Facebook's LLaMA model in C/C++ developed by Georgi Gerganov. Llama.cpp allows the inference of LLaMA and other supported models in C/C++. For CPU inference Llama.cpp supports AVX2/AVX-512, ARM NEON, and other modern ISAs along with features like OpenBLAS usage. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b1808 Model: llama-2-7b.Q4_0.gguf 2 x 32GB DDR5-4800 4 8 12 16 20 13.67 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas
OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b1808 Model: llama-2-13b.Q4_0.gguf 2 x 32GB DDR5-4800 2 4 6 8 10 7.3 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas
OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b1808 Model: llama-2-70b-chat.Q5_0.gguf 2 x 32GB DDR5-4800 0.2745 0.549 0.8235 1.098 1.3725 1.22 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas
Llamafile Mozilla's Llamafile allows distributing and running large language models (LLMs) as a single file. Llamafile aims to make open-source LLMs more accessible to developers and users. Llamafile supports a variety of models, CPUs and GPUs, and other options. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Tokens Per Second, More Is Better Llamafile 0.7 Test: llava-v1.5-7b-q4 - Acceleration: CPU 2 x 32GB DDR5-4800 3 6 9 12 15 11.44
OpenBenchmarking.org Tokens Per Second, More Is Better Llamafile 0.7 Test: wizardcoder-python-34b-v1.0.Q6_K - Acceleration: CPU 2 x 32GB DDR5-4800 0.4545 0.909 1.3635 1.818 2.2725 2.02
LuxCoreRender LuxCoreRender is an open-source 3D physically based renderer formerly known as LuxRender. LuxCoreRender supports CPU-based rendering as well as GPU acceleration via OpenCL, NVIDIA CUDA, and NVIDIA OptiX interfaces. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org M samples/sec, More Is Better LuxCoreRender 2.6 Scene: DLSC - Acceleration: CPU 2 x 32GB DDR5-4800 0.8145 1.629 2.4435 3.258 4.0725 3.62 MIN: 3.53 / MAX: 3.78
OpenBenchmarking.org MiB/s, More Is Better MBW 2018-09-08 Test: Memory Copy - Array Size: 4096 MiB 2 x 32GB DDR5-4800 4K 8K 12K 16K 20K 17823.97 1. (CC) gcc options: -O3 -march=native
OpenBenchmarking.org MiB/s, More Is Better MBW 2018-09-08 Test: Memory Copy - Array Size: 8192 MiB 2 x 32GB DDR5-4800 4K 8K 12K 16K 20K 17709.53 1. (CC) gcc options: -O3 -march=native
OpenBenchmarking.org MiB/s, More Is Better MBW 2018-09-08 Test: Memory Copy, Fixed Block Size - Array Size: 1024 MiB 2 x 32GB DDR5-4800 3K 6K 9K 12K 15K 16190.13 1. (CC) gcc options: -O3 -march=native
OpenBenchmarking.org MiB/s, More Is Better MBW 2018-09-08 Test: Memory Copy, Fixed Block Size - Array Size: 4096 MiB 2 x 32GB DDR5-4800 3K 6K 9K 12K 15K 16019.16 1. (CC) gcc options: -O3 -march=native
OpenBenchmarking.org MiB/s, More Is Better MBW 2018-09-08 Test: Memory Copy, Fixed Block Size - Array Size: 8192 MiB 2 x 32GB DDR5-4800 4K 8K 12K 16K 20K 16405.97 1. (CC) gcc options: -O3 -march=native
NAMD NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org ns/day, More Is Better NAMD 3.0b6 Input: ATPase with 327,506 Atoms 2 x 32GB DDR5-4800 0.5103 1.0206 1.5309 2.0412 2.5515 2.26790
NAS Parallel Benchmarks NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB. This test profile offers selecting the different NPB tests/problems and varying problem sizes. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Total Mop/s, More Is Better NAS Parallel Benchmarks 3.4 Test / Class: BT.C 2 x 32GB DDR5-4800 8K 16K 24K 32K 40K 36869.25 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.6
OpenBenchmarking.org Total Mop/s, More Is Better NAS Parallel Benchmarks 3.4 Test / Class: CG.C 2 x 32GB DDR5-4800 2K 4K 6K 8K 10K 9531.48 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.6
OpenBenchmarking.org Total Mop/s, More Is Better NAS Parallel Benchmarks 3.4 Test / Class: EP.C 2 x 32GB DDR5-4800 500 1000 1500 2000 2500 2189.28 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.6
OpenBenchmarking.org Total Mop/s, More Is Better NAS Parallel Benchmarks 3.4 Test / Class: EP.D 2 x 32GB DDR5-4800 500 1000 1500 2000 2500 2178.72 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.6
OpenBenchmarking.org Total Mop/s, More Is Better NAS Parallel Benchmarks 3.4 Test / Class: FT.C 2 x 32GB DDR5-4800 5K 10K 15K 20K 25K 22528.92 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.6
OpenBenchmarking.org Total Mop/s, More Is Better NAS Parallel Benchmarks 3.4 Test / Class: IS.D 2 x 32GB DDR5-4800 300 600 900 1200 1500 1292.66 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.6
OpenBenchmarking.org Total Mop/s, More Is Better NAS Parallel Benchmarks 3.4 Test / Class: LU.C 2 x 32GB DDR5-4800 9K 18K 27K 36K 45K 41111.39 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.6
OpenBenchmarking.org Total Mop/s, More Is Better NAS Parallel Benchmarks 3.4 Test / Class: MG.C 2 x 32GB DDR5-4800 5K 10K 15K 20K 25K 22577.88 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.6
OpenBenchmarking.org Total Mop/s, More Is Better NAS Parallel Benchmarks 3.4 Test / Class: SP.B 2 x 32GB DDR5-4800 4K 8K 12K 16K 20K 18304.39 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.6
OpenBenchmarking.org Total Mop/s, More Is Better NAS Parallel Benchmarks 3.4 Test / Class: SP.C 2 x 32GB DDR5-4800 2K 4K 6K 8K 10K 11331.51 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.6
nginx This is a benchmark of the lightweight Nginx HTTP(S) web-server. This Nginx web server benchmark test profile makes use of the wrk program for facilitating the HTTP requests over a fixed period time with a configurable number of concurrent clients/connections. HTTPS with a self-signed OpenSSL certificate is used by this test for local benchmarking. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Requests Per Second, More Is Better nginx 1.23.2 Connections: 500 2 x 32GB DDR5-4800 30K 60K 90K 120K 150K 120941.12 1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
OpenBenchmarking.org Requests Per Second, More Is Better nginx 1.23.2 Connections: 1000 2 x 32GB DDR5-4800 20K 40K 60K 80K 100K 114755.2 1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
oneDNN This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: IP Shapes 1D - Engine: CPU 2 x 32GB DDR5-4800 0.4207 0.8414 1.2621 1.6828 2.1035 1.86964 MIN: 1.82 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: IP Shapes 3D - Engine: CPU 2 x 32GB DDR5-4800 0.8383 1.6766 2.5149 3.3532 4.1915 3.72571 MIN: 3.69 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: Convolution Batch Shapes Auto - Engine: CPU 2 x 32GB DDR5-4800 2 4 6 8 10 6.33617 MIN: 6.24 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: Deconvolution Batch shapes_1d - Engine: CPU 2 x 32GB DDR5-4800 0.7378 1.4756 2.2134 2.9512 3.689 3.27921 MIN: 2.81 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: Deconvolution Batch shapes_3d - Engine: CPU 2 x 32GB DDR5-4800 0.7736 1.5472 2.3208 3.0944 3.868 3.43803 MIN: 3.21 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: Recurrent Neural Network Training - Engine: CPU 2 x 32GB DDR5-4800 400 800 1200 1600 2000 1865.66 MIN: 1857.6 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: Recurrent Neural Network Inference - Engine: CPU 2 x 32GB DDR5-4800 200 400 600 800 1000 963.18 MIN: 955.07 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
OpenFOAM OpenFOAM is the leading free, open-source software for computational fluid dynamics (CFD). This test profile currently uses the drivaerFastback test case for analyzing automotive aerodynamics or alternatively the older motorBike input. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better OpenFOAM 10 Input: drivaerFastback, Small Mesh Size - Mesh Time 2 x 32GB DDR5-4800 7 14 21 28 35 28.68 1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm
OpenBenchmarking.org Seconds, Fewer Is Better OpenFOAM 10 Input: drivaerFastback, Small Mesh Size - Execution Time 2 x 32GB DDR5-4800 40 80 120 160 200 196.38 1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm
OpenRadioss OpenRadioss is an open-source AGPL-licensed finite element solver for dynamic event analysis OpenRadioss is based on Altair Radioss and open-sourced in 2022. This open-source finite element solver is benchmarked with various example models available from https://www.openradioss.org/models/ and https://github.com/OpenRadioss/ModelExchange/tree/main/Examples. This test is currently using a reference OpenRadioss binary build offered via GitHub. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better OpenRadioss 2023.09.15 Model: Bumper Beam 2 x 32GB DDR5-4800 20 40 60 80 100 105.47
OpenSSL OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: SHA256 2 x 32GB DDR5-4800 5000M 10000M 15000M 20000M 25000M 24692383490 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: SHA512 2 x 32GB DDR5-4800 2000M 4000M 6000M 8000M 10000M 8063332180 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenBenchmarking.org sign/s, More Is Better OpenSSL 3.3 Algorithm: RSA4096 2 x 32GB DDR5-4800 2K 4K 6K 8K 10K 9842.1 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenBenchmarking.org verify/s, More Is Better OpenSSL 3.3 Algorithm: RSA4096 2 x 32GB DDR5-4800 60K 120K 180K 240K 300K 279049.8 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: ChaCha20 2 x 32GB DDR5-4800 20000M 40000M 60000M 80000M 100000M 95727912000 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: AES-128-GCM 2 x 32GB DDR5-4800 40000M 80000M 120000M 160000M 200000M 180693439010 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: AES-256-GCM 2 x 32GB DDR5-4800 30000M 60000M 90000M 120000M 150000M 155271939960 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: ChaCha20-Poly1305 2 x 32GB DDR5-4800 15000M 30000M 45000M 60000M 75000M 67937414080 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenVINO This is a test of the Intel OpenVINO, a toolkit around neural networks, using its built-in benchmarking support and analyzing the throughput and latency for various models. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Face Detection FP16 - Device: CPU 2 x 32GB DDR5-4800 3 6 9 12 15 9.57 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Face Detection FP16 - Device: CPU 2 x 32GB DDR5-4800 140 280 420 560 700 625.5 MIN: 596.94 / MAX: 641.38 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Person Detection FP16 - Device: CPU 2 x 32GB DDR5-4800 20 40 60 80 100 74.4 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Person Detection FP16 - Device: CPU 2 x 32GB DDR5-4800 20 40 60 80 100 80.58 MIN: 60.22 / MAX: 102.9 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Person Detection FP32 - Device: CPU 2 x 32GB DDR5-4800 20 40 60 80 100 74.91 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Person Detection FP32 - Device: CPU 2 x 32GB DDR5-4800 20 40 60 80 100 80.04 MIN: 61.76 / MAX: 107.58 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Vehicle Detection FP16 - Device: CPU 2 x 32GB DDR5-4800 140 280 420 560 700 645.63 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Vehicle Detection FP16 - Device: CPU 2 x 32GB DDR5-4800 3 6 9 12 15 9.26 MIN: 4.6 / MAX: 15.49 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Face Detection FP16-INT8 - Device: CPU 2 x 32GB DDR5-4800 5 10 15 20 25 18.57 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Face Detection FP16-INT8 - Device: CPU 2 x 32GB DDR5-4800 70 140 210 280 350 322.63 MIN: 296.85 / MAX: 328.51 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Face Detection Retail FP16 - Device: CPU 2 x 32GB DDR5-4800 500 1000 1500 2000 2500 2212.08 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Face Detection Retail FP16 - Device: CPU 2 x 32GB DDR5-4800 0.5985 1.197 1.7955 2.394 2.9925 2.66 MIN: 1.38 / MAX: 17.22 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Road Segmentation ADAS FP16 - Device: CPU 2 x 32GB DDR5-4800 60 120 180 240 300 281.64 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Road Segmentation ADAS FP16 - Device: CPU 2 x 32GB DDR5-4800 5 10 15 20 25 21.27 MIN: 9.77 / MAX: 37.47 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Vehicle Detection FP16-INT8 - Device: CPU 2 x 32GB DDR5-4800 200 400 600 800 1000 1152.68 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Vehicle Detection FP16-INT8 - Device: CPU 2 x 32GB DDR5-4800 1.1655 2.331 3.4965 4.662 5.8275 5.18 MIN: 2.88 / MAX: 13.18 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Weld Porosity Detection FP16 - Device: CPU 2 x 32GB DDR5-4800 200 400 600 800 1000 961.04 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Weld Porosity Detection FP16 - Device: CPU 2 x 32GB DDR5-4800 3 6 9 12 15 12.46 MIN: 6.31 / MAX: 17.78 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Face Detection Retail FP16-INT8 - Device: CPU 2 x 32GB DDR5-4800 700 1400 2100 2800 3500 3252.98 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Face Detection Retail FP16-INT8 - Device: CPU 2 x 32GB DDR5-4800 0.81 1.62 2.43 3.24 4.05 3.6 MIN: 1.94 / MAX: 11.06 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Road Segmentation ADAS FP16-INT8 - Device: CPU 2 x 32GB DDR5-4800 80 160 240 320 400 345.8 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Road Segmentation ADAS FP16-INT8 - Device: CPU 2 x 32GB DDR5-4800 4 8 12 16 20 17.32 MIN: 8.73 / MAX: 23.67 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Machine Translation EN To DE FP16 - Device: CPU 2 x 32GB DDR5-4800 20 40 60 80 100 99.05 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Machine Translation EN To DE FP16 - Device: CPU 2 x 32GB DDR5-4800 14 28 42 56 70 60.53 MIN: 33.29 / MAX: 82.21 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Weld Porosity Detection FP16-INT8 - Device: CPU 2 x 32GB DDR5-4800 400 800 1200 1600 2000 1885.03 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Weld Porosity Detection FP16-INT8 - Device: CPU 2 x 32GB DDR5-4800 2 4 6 8 10 6.33 MIN: 3.26 / MAX: 19.39 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Person Vehicle Bike Detection FP16 - Device: CPU 2 x 32GB DDR5-4800 200 400 600 800 1000 1090.85 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Person Vehicle Bike Detection FP16 - Device: CPU 2 x 32GB DDR5-4800 1.233 2.466 3.699 4.932 6.165 5.48 MIN: 3.89 / MAX: 9.6 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Noise Suppression Poconet-Like FP16 - Device: CPU 2 x 32GB DDR5-4800 300 600 900 1200 1500 1246.91 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Noise Suppression Poconet-Like FP16 - Device: CPU 2 x 32GB DDR5-4800 3 6 9 12 15 9.47 MIN: 6.62 / MAX: 12.05 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Handwritten English Recognition FP16 - Device: CPU 2 x 32GB DDR5-4800 110 220 330 440 550 513.48 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Handwritten English Recognition FP16 - Device: CPU 2 x 32GB DDR5-4800 6 12 18 24 30 23.34 MIN: 14.47 / MAX: 31.02 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Person Re-Identification Retail FP16 - Device: CPU 2 x 32GB DDR5-4800 300 600 900 1200 1500 1316.48 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Person Re-Identification Retail FP16 - Device: CPU 2 x 32GB DDR5-4800 1.0215 2.043 3.0645 4.086 5.1075 4.54 MIN: 2.61 / MAX: 10.15 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU 2 x 32GB DDR5-4800 5K 10K 15K 20K 25K 24761.85 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU 2 x 32GB DDR5-4800 0.1013 0.2026 0.3039 0.4052 0.5065 0.45 MIN: 0.22 / MAX: 8.77 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Handwritten English Recognition FP16-INT8 - Device: CPU 2 x 32GB DDR5-4800 120 240 360 480 600 553.36 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Handwritten English Recognition FP16-INT8 - Device: CPU 2 x 32GB DDR5-4800 5 10 15 20 25 21.66 MIN: 17.58 / MAX: 38.77 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU 2 x 32GB DDR5-4800 7K 14K 21K 28K 35K 34985.34 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU 2 x 32GB DDR5-4800 0.0698 0.1396 0.2094 0.2792 0.349 0.31 MIN: 0.16 / MAX: 6.53 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OSPRay Intel OSPRay is a portable ray-tracing engine for high-performance, high-fidelity scientific visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Items Per Second, More Is Better OSPRay 3.1 Benchmark: particle_volume/ao/real_time 2 x 32GB DDR5-4800 2 4 6 8 10 6.45817
OpenBenchmarking.org Items Per Second, More Is Better OSPRay 3.1 Benchmark: gravity_spheres_volume/dim_512/ao/real_time 2 x 32GB DDR5-4800 1.2371 2.4742 3.7113 4.9484 6.1855 5.49836
OpenBenchmarking.org Items Per Second, More Is Better OSPRay 3.1 Benchmark: gravity_spheres_volume/dim_512/scivis/real_time 2 x 32GB DDR5-4800 1.2203 2.4406 3.6609 4.8812 6.1015 5.42347
OpenBenchmarking.org Items Per Second, More Is Better OSPRay 3.1 Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time 2 x 32GB DDR5-4800 2 4 6 8 10 6.54819
OSPRay Studio Intel OSPRay Studio is an open-source, interactive visualization and ray-tracing software package. OSPRay Studio makes use of Intel OSPRay, a portable ray-tracing engine for high-performance, high-fidelity visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU 2 x 32GB DDR5-4800 1300 2600 3900 5200 6500 5957
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 2 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU 2 x 32GB DDR5-4800 1300 2600 3900 5200 6500 6027
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU 2 x 32GB DDR5-4800 1500 3000 4500 6000 7500 7065
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 1 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU 2 x 32GB DDR5-4800 20K 40K 60K 80K 100K 99448
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 1 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU 2 x 32GB DDR5-4800 40K 80K 120K 160K 200K 194961
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 2 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU 2 x 32GB DDR5-4800 20K 40K 60K 80K 100K 100245
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 2 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU 2 x 32GB DDR5-4800 40K 80K 120K 160K 200K 196691
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 3 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU 2 x 32GB DDR5-4800 30K 60K 90K 120K 150K 116796
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 3 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU 2 x 32GB DDR5-4800 50K 100K 150K 200K 250K 229326
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 1 - Resolution: 1080p - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU 2 x 32GB DDR5-4800 300 600 900 1200 1500 1503
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 2 - Resolution: 1080p - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU 2 x 32GB DDR5-4800 300 600 900 1200 1500 1514
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 3 - Resolution: 1080p - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU 2 x 32GB DDR5-4800 400 800 1200 1600 2000 1770
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 1 - Resolution: 1080p - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU 2 x 32GB DDR5-4800 5K 10K 15K 20K 25K 23994
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 1 - Resolution: 1080p - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU 2 x 32GB DDR5-4800 11K 22K 33K 44K 55K 51162
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 2 - Resolution: 1080p - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU 2 x 32GB DDR5-4800 5K 10K 15K 20K 25K 24232
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 2 - Resolution: 1080p - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU 2 x 32GB DDR5-4800 11K 22K 33K 44K 55K 52245
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 3 - Resolution: 1080p - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU 2 x 32GB DDR5-4800 7K 14K 21K 28K 35K 32273
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 3 - Resolution: 1080p - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU 2 x 32GB DDR5-4800 13K 26K 39K 52K 65K 60379
OpenBenchmarking.org ms, Fewer Is Better PostgreSQL 16 Scaling Factor: 100 - Clients: 1000 - Mode: Read Only - Average Latency 2 x 32GB DDR5-4800 0.3116 0.6232 0.9348 1.2464 1.558 1.385 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
OpenBenchmarking.org TPS, More Is Better PostgreSQL 16 Scaling Factor: 100 - Clients: 1000 - Mode: Read Write 2 x 32GB DDR5-4800 7K 14K 21K 28K 35K 33933 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
OpenBenchmarking.org ms, Fewer Is Better PostgreSQL 16 Scaling Factor: 100 - Clients: 1000 - Mode: Read Write - Average Latency 2 x 32GB DDR5-4800 7 14 21 28 35 29.47 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
Quicksilver Quicksilver is a proxy application that represents some elements of the Mercury workload by solving a simplified dynamic Monte Carlo particle transport problem. Quicksilver is developed by Lawrence Livermore National Laboratory (LLNL) and this test profile currently makes use of the OpenMP CPU threaded code path. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Figure Of Merit, More Is Better Quicksilver 20230818 Input: CTS2 2 x 32GB DDR5-4800 3M 6M 9M 12M 15M 16160000 1. (CXX) g++ options: -fopenmp -O3 -march=native
RNNoise RNNoise is a recurrent neural network for audio noise reduction developed by Mozilla and Xiph.Org. This test profile is a single-threaded test measuring the time to denoise a sample 26 minute long 16-bit RAW audio file using this recurrent neural network noise suppression library. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better RNNoise 0.2 Input: 26 Minute Long Talking Sample 2 x 32GB DDR5-4800 2 4 6 8 10 7.54 1. (CC) gcc options: -O2 -pedantic -fvisibility=hidden
OpenBenchmarking.org Op/s, More Is Better RocksDB 9.0 Test: Update Random 2 x 32GB DDR5-4800 150K 300K 450K 600K 750K 703415 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
OpenBenchmarking.org Op/s, More Is Better RocksDB 9.0 Test: Read While Writing 2 x 32GB DDR5-4800 700K 1400K 2100K 2800K 3500K 3174105 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
OpenBenchmarking.org Op/s, More Is Better RocksDB 9.0 Test: Read Random Write Random 2 x 32GB DDR5-4800 500K 1000K 1500K 2000K 2500K 2542969 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
SPECFEM3D simulates acoustic (fluid), elastic (solid), coupled acoustic/elastic, poroelastic or seismic wave propagation in any type of conforming mesh of hexahedra. This test profile currently relies on CPU-based execution for SPECFEM3D and using a variety of their built-in examples/models for benchmarking. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better SPECFEM3D 4.1.1 Model: Mount St. Helens 2 x 32GB DDR5-4800 9 18 27 36 45 37.20 1. (F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
OpenBenchmarking.org Seconds, Fewer Is Better SPECFEM3D 4.1.1 Model: Layered Halfspace 2 x 32GB DDR5-4800 20 40 60 80 100 102.83 1. (F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
OpenBenchmarking.org Seconds, Fewer Is Better SPECFEM3D 4.1.1 Model: Tomographic Model 2 x 32GB DDR5-4800 8 16 24 32 40 36.59 1. (F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
OpenBenchmarking.org Seconds, Fewer Is Better SPECFEM3D 4.1.1 Model: Homogeneous Halfspace 2 x 32GB DDR5-4800 11 22 33 44 55 47.02 1. (F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
OpenBenchmarking.org Seconds, Fewer Is Better SPECFEM3D 4.1.1 Model: Water-layered Halfspace 2 x 32GB DDR5-4800 20 40 60 80 100 100.60 1. (F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
srsRAN Project srsRAN Project is a complete ORAN-native 5G RAN solution created by Software Radio Systems (SRS). The srsRAN Project radio suite was formerly known as srsLTE and can be used for building your own software-defined radio (SDR) 4G/5G mobile network. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.10.1-20240325 Test: PDSCH Processor Benchmark, Throughput Total 2 x 32GB DDR5-4800 2K 4K 6K 8K 10K 11196.8 1. (CXX) g++ options: -O3 -march=native -mavx2 -mavx -msse4.1 -mfma -mavx512f -mavx512cd -mavx512bw -mavx512dq -fno-trapping-math -fno-math-errno -ldl
OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.10.1-20240325 Test: PUSCH Processor Benchmark, Throughput Total 2 x 32GB DDR5-4800 300 600 900 1200 1500 1559.3 MIN: 954.9 1. (CXX) g++ options: -O3 -march=native -mavx2 -mavx -msse4.1 -mfma -mavx512f -mavx512cd -mavx512bw -mavx512dq -fno-trapping-math -fno-math-errno -ldl
OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.10.1-20240325 Test: PDSCH Processor Benchmark, Throughput Thread 2 x 32GB DDR5-4800 200 400 600 800 1000 882.5 1. (CXX) g++ options: -O3 -march=native -mavx2 -mavx -msse4.1 -mfma -mavx512f -mavx512cd -mavx512bw -mavx512dq -fno-trapping-math -fno-math-errno -ldl
OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.10.1-20240325 Test: PUSCH Processor Benchmark, Throughput Thread 2 x 32GB DDR5-4800 40 80 120 160 200 183.4 MIN: 105.9 1. (CXX) g++ options: -O3 -march=native -mavx2 -mavx -msse4.1 -mfma -mavx512f -mavx512cd -mavx512bw -mavx512dq -fno-trapping-math -fno-math-errno -ldl
Stockfish This is a test of Stockfish, an advanced open-source C++11 chess benchmark that can scale up to 1024 CPU threads. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Nodes Per Second, More Is Better Stockfish 16.1 Chess Benchmark 2 x 32GB DDR5-4800 6M 12M 18M 24M 30M 30072838 1. (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -funroll-loops -msse -msse3 -mpopcnt -mavx2 -mbmi -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto-partition=one -flto=jobserver
SVT-AV1 This is a benchmark of the SVT-AV1 open-source video encoder/decoder. SVT-AV1 was originally developed by Intel as part of their Open Visual Cloud / Scalable Video Technology (SVT). Development of SVT-AV1 has since moved to the Alliance for Open Media as part of upstream AV1 development. SVT-AV1 is a CPU-based multi-threaded video encoder for the AV1 video format with a sample YUV video file. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 4 - Input: Bosphorus 4K 2 x 32GB DDR5-4800 2 4 6 8 10 6.679 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 8 - Input: Bosphorus 4K 2 x 32GB DDR5-4800 13 26 39 52 65 57.44 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 12 - Input: Bosphorus 4K 2 x 32GB DDR5-4800 30 60 90 120 150 145.78 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 13 - Input: Bosphorus 4K 2 x 32GB DDR5-4800 30 60 90 120 150 146.71 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 4 - Input: Bosphorus 1080p 2 x 32GB DDR5-4800 5 10 15 20 25 22.34 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 8 - Input: Bosphorus 1080p 2 x 32GB DDR5-4800 40 80 120 160 200 163.52 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 12 - Input: Bosphorus 1080p 2 x 32GB DDR5-4800 120 240 360 480 600 565.03 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 13 - Input: Bosphorus 1080p 2 x 32GB DDR5-4800 140 280 420 560 700 648.57 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
TensorFlow This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 1 - Model: AlexNet 2 x 32GB DDR5-4800 4 8 12 16 20 14.48
VVenC VVenC is the Fraunhofer Versatile Video Encoder as a fast/efficient H.266/VVC encoder. The vvenc encoder makes use of SIMD Everywhere (SIMDe). The vvenc software is published under the Clear BSD License. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.11 Video Input: Bosphorus 4K - Video Preset: Fast 2 x 32GB DDR5-4800 2 4 6 8 10 6.529 1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.11 Video Input: Bosphorus 4K - Video Preset: Faster 2 x 32GB DDR5-4800 4 8 12 16 20 14.93 1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.11 Video Input: Bosphorus 1080p - Video Preset: Fast 2 x 32GB DDR5-4800 5 10 15 20 25 20.41 1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.11 Video Input: Bosphorus 1080p - Video Preset: Faster 2 x 32GB DDR5-4800 11 22 33 44 55 46.64 1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
Xcompact3d Incompact3d Xcompact3d Incompact3d is a Fortran-MPI based, finite difference high-performance code for solving the incompressible Navier-Stokes equation and as many as you need scalar transport equations. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better Xcompact3d Incompact3d 2021-03-11 Input: input.i3d 129 Cells Per Direction 2 x 32GB DDR5-4800 4 8 12 16 20 16.67 1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
OpenBenchmarking.org Seconds, Fewer Is Better Xcompact3d Incompact3d 2021-03-11 Input: input.i3d 193 Cells Per Direction 2 x 32GB DDR5-4800 20 40 60 80 100 79.43 1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2 x 32GB DDR5-4800 Processor: AMD Ryzen 9 7900 12-Core @ 5.48GHz (12 Cores / 24 Threads), Motherboard: ASRockRack 1U4LW-B650/2L2T B650D4U-2L2T/BCM (2.09 BIOS), Chipset: AMD Device 14d8, Memory: 2 x 32GB DDR5-4800MT/s Micron MTC20C2085S1EC48BA1, Disk: 1024GB SOLIDIGM SSDPFKNU010TZ, Graphics: ASPEED, Audio: AMD Rembrandt Radeon HD Audio, Monitor: VA2431, Network: 2 x Intel I210 + 2 x Broadcom BCM57416 NetXtreme-E Dual-Media 10G RDMA
OS: Ubuntu 24.04, Kernel: 6.8.0-31-generic (x86_64), Desktop: GNOME Shell 46.0, Display Server: X Server, Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1920x1200
Kernel Notes: Transparent Huge Pages: madviseCompiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -vProcessor Notes: Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - CPU Microcode: 0xa601203Python Notes: Python 3.12.3Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Vulnerable: Safe RET no microcode + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 23 April 2024 23:07 by user phoronix.