AMD EPYC 9755 memory performance at the default DDR5-6000 versus DDR5-4800. Benchmarks by Michael Larabel for a future article.
DDR5-4800: Processor: AMD EPYC 9755 128-Core @ 2.70GHz (128 Cores / 256 Threads), Motherboard: AMD VOLCANO (RVOT1000D BIOS), Chipset: AMD Device 153a, Memory: 12 x 64GB DDR5-4800MT/s Samsung M321R8GA0PB1-CCPKC, Disk: 2 x 1920GB KIOXIA KCD8XPUG1T92, Graphics: ASPEED, Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 24.04, Kernel: 6.10.0-phx (x86_64), Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1920x1200
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-OiuXZC/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-OiuXZC/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xb002110
Java Notes: OpenJDK Runtime Environment (build 21.0.3-ea+7-Ubuntu-1build1)
Python Notes: Python 3.12.2
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
DDR5-6000: Changed Memory to 12 x 64GB DDR5-6000MT/s Samsung M321R8GA0PB1-CCPKC.
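For context on what the two memory settings mean on paper, the following is a rough sketch of the theoretical peak bandwidth for each configuration. It assumes the 12 DIMMs populate 12 independent memory channels (one DIMM per channel) and that each channel moves 8 bytes per transfer; both are assumptions for illustration rather than details stated in the system table, and the function name is hypothetical. Real workloads will see less than these peak figures.

```python
def peak_bandwidth_gbs(channels: int, mt_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s (decimal gigabytes).

    channels  -- number of populated memory channels (assumed 12 here)
    mt_per_s  -- transfer rate in mega-transfers per second (4800 or 6000)
    bus_bytes -- bytes moved per transfer per channel (assumed 8, i.e. 64-bit)
    """
    return channels * mt_per_s * 1e6 * bus_bytes / 1e9


slow = peak_bandwidth_gbs(12, 4800)
fast = peak_bandwidth_gbs(12, 6000)

print(f"DDR5-4800: {slow:.1f} GB/s theoretical peak")
print(f"DDR5-6000: {fast:.1f} GB/s theoretical peak")
print(f"Ratio:     {fast / slow:.2f}x")
```

Under these assumptions the faster setting offers at most 25% more bandwidth, which gives an upper bound to keep in mind when reading the bandwidth-sensitive results.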
7-Zip Compression 22.01, Test: Decompression Rating (MIPS, More Is Better): DDR5-4800: 845631 (SE +/- 719.43, N = 3); DDR5-6000: 846736 (SE +/- 276.36, N = 3). 1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC
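Each result is reported with an "SE +/-" value and a run count N. As a minimal sketch of how such a figure can be derived, the snippet below computes the standard error of the mean as the sample standard deviation divided by the square root of N. Whether the benchmarking software uses exactly this estimator is an assumption, and the three run values are hypothetical, not taken from the results.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample stdev divided by sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("need at least two runs to estimate spread")
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical per-run scores for one benchmark (N = 3).
runs = [845000.0, 846000.0, 846900.0]

print(f"mean = {statistics.mean(runs):.0f}")
print(f"SE +/- {standard_error(runs):.2f}, N = {len(runs)}")
```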
Algebraic Multi-Grid Benchmark AMG is a parallel algebraic multigrid solver for linear systems arising from problems on unstructured grids. The driver provided with AMG builds linear systems for various 3-dimensional problems. Learn more via the OpenBenchmarking.org test page.
Algebraic Multi-Grid Benchmark 1.2 (Figure Of Merit, More Is Better): DDR5-4800: 2698476000 (SE +/- 2882733.14, N = 3); DDR5-6000: 3177082000 (SE +/- 6342005.91, N = 3). 1. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -lmpi
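Because some results are "More Is Better" and others are "Fewer Is Better", a relative gain has to be computed with the direction in mind. The sketch below shows one way to do that; the helper name is hypothetical, and the inputs are the AMG figure of merit above and the CloverLeaf clover_bm16 run times (209.96 and 178.93 seconds) reported further down.

```python
def percent_gain(baseline: float, candidate: float, higher_is_better: bool = True) -> float:
    """Percent improvement of candidate over baseline.

    For throughput-style metrics (higher is better) this is candidate/baseline - 1.
    For time-style metrics (lower is better) the ratio is inverted so that a
    shorter run time still yields a positive gain.
    """
    if higher_is_better:
        return (candidate / baseline - 1.0) * 100.0
    return (baseline / candidate - 1.0) * 100.0


# AMG figure of merit, DDR5-4800 vs DDR5-6000 (More Is Better).
amg = percent_gain(2698476000, 3177082000)

# CloverLeaf clover_bm16 run time in seconds (Fewer Is Better).
clover = percent_gain(209.96, 178.93, higher_is_better=False)

print(f"AMG:        {amg:+.1f}%")
print(f"CloverLeaf: {clover:+.1f}%")
```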
Apache IoTDB Apache IoTDB is a time series database and this benchmark is facilitated using the IoT Benchmark [https://github.com/thulab/iot-benchmark/]. Learn more via the OpenBenchmarking.org test page.
Apache IoTDB 1.2, Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 400 (point/sec, More Is Better): DDR5-4800: 118270441 (SE +/- 266876.94, N = 3); DDR5-6000: 117516797 (SE +/- 120972.23, N = 3).
Apache IoTDB 1.2, Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 400 (Average Latency, Fewer Is Better): DDR5-4800: 225.70 (SE +/- 1.40, N = 3; MAX: 26697.41); DDR5-6000: 225.87 (SE +/- 0.92, N = 3; MAX: 26595.42).
Apache IoTDB 1.2, Device Count: 800 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 100 (point/sec, More Is Better): DDR5-4800: 133128010 (SE +/- 33011.72, N = 3); DDR5-6000: 134210671 (SE +/- 338573.54, N = 3).
Apache IoTDB 1.2, Device Count: 800 - Batch Size Per Write: 100 - Sensor Count: 800 - Client Number: 100 (Average Latency, Fewer Is Better): DDR5-4800: 57.58 (SE +/- 0.13, N = 3; MAX: 23818.56); DDR5-6000: 57.30 (SE +/- 0.13, N = 3; MAX: 23846.01).
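Many of the gaps between the two memory speeds are small compared with the reported standard errors. One rough way to screen for that, sketched below, is to compare the difference of the two means against a multiple of the combined standard error. The helper name and the threshold of two combined standard errors are assumptions chosen for illustration, and with only three runs per side this is a coarse heuristic rather than a formal significance test. The inputs are the Apache IoTDB figures above.

```python
import math


def within_noise(mean_a: float, se_a: float, mean_b: float, se_b: float, k: float = 2.0) -> bool:
    """Return True when |mean_a - mean_b| is no larger than k combined standard errors."""
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) <= k * combined_se


# Average latency, Device Count 500 / Client Number 400 (DDR5-4800 vs DDR5-6000).
latency_same = within_noise(225.70, 1.40, 225.87, 0.92)

# Throughput in point/sec, Device Count 800 / Client Number 100.
throughput_same = within_noise(133128010, 33011.72, 134210671, 338573.54)

print("latency difference within noise:   ", latency_same)
print("throughput difference within noise:", throughput_same)
```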
ASKAP ASKAP is a set of benchmarks from the Australian SKA Pathfinder. The principal ASKAP benchmarks are the Hogbom Clean Benchmark (tHogbomClean) and Convolutional Resampling Benchmark (tConvolve), along with some earlier ASKAP benchmarks for OpenCL and CUDA execution of tConvolve. Learn more via the OpenBenchmarking.org test page.
ASKAP 1.0, Test: tConvolve MPI - Degridding (Mpix/sec, More Is Better): DDR5-4800: 66651.4 (SE +/- 610.89, N = 3); DDR5-6000: 70762.8 (SE +/- 395.30, N = 3).
ASKAP 1.0, Test: tConvolve MPI - Gridding (Mpix/sec, More Is Better): DDR5-4800: 68467.4 (SE +/- 752.40, N = 3); DDR5-6000: 74093.3 (SE +/- 438.43, N = 3).
ASKAP 1.0, Test: tConvolve MT - Gridding (Million Grid Points Per Second, More Is Better): DDR5-4800: 14143.3 (SE +/- 9.34, N = 3); DDR5-6000: 16464.1 (SE +/- 0.00, N = 3).
ASKAP 1.0, Test: tConvolve MT - Degridding (Million Grid Points Per Second, More Is Better): DDR5-4800: 22665.1 (SE +/- 10.95, N = 3); DDR5-6000: 27297.4 (SE +/- 28.93, N = 3).
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASTC Encoder ASTC Encoder (astcenc) is for the Adaptive Scalable Texture Compression (ASTC) format commonly used with OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile does a coding test of both compression/decompression. Learn more via the OpenBenchmarking.org test page.
ASTC Encoder 4.7, Preset: Medium (MT/s, More Is Better): DDR5-4800: 697.19 (SE +/- 5.72, N = 3); DDR5-6000: 701.73 (SE +/- 3.94, N = 8).
ASTC Encoder 4.7, Preset: Thorough (MT/s, More Is Better): DDR5-4800: 110.03 (SE +/- 0.01, N = 3); DDR5-6000: 110.04 (SE +/- 0.02, N = 6).
ASTC Encoder 4.7, Preset: Very Thorough (MT/s, More Is Better): DDR5-4800: 15.71 (SE +/- 0.01, N = 3); DDR5-6000: 15.72 (SE +/- 0.00, N = 3).
ASTC Encoder 4.7, Preset: Exhaustive (MT/s, More Is Better): DDR5-4800: 9.6576 (SE +/- 0.0032, N = 3); DDR5-6000: 9.6576 (SE +/- 0.0004, N = 3).
1. (CXX) g++ options: -O3 -flto -pthread
Blender Blender is an open-source 3D creation and modeling software project. This test is of Blender's Cycles performance with various sample files. GPU computing via NVIDIA OptiX and NVIDIA CUDA is currently supported as well as HIP for AMD Radeon GPUs and Intel oneAPI for Intel Graphics. Learn more via the OpenBenchmarking.org test page.
Blender 4.1, Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better): DDR5-4800: 9.63 (SE +/- 0.08, N = 3); DDR5-6000: 9.46 (SE +/- 0.01, N = 5).
Blender 4.1, Blend File: Classroom - Compute: CPU-Only (Seconds, Fewer Is Better): DDR5-4800: 24.21 (SE +/- 0.04, N = 3); DDR5-6000: 24.00 (SE +/- 0.05, N = 3).
Blender 4.1, Blend File: Fishy Cat - Compute: CPU-Only (Seconds, Fewer Is Better): DDR5-4800: 12.53 (SE +/- 0.05, N = 3); DDR5-6000: 12.50 (SE +/- 0.01, N = 4).
Blender 4.1, Blend File: Pabellon Barcelona - Compute: CPU-Only (Seconds, Fewer Is Better): DDR5-4800: 29.90 (SE +/- 0.10, N = 3); DDR5-6000: 29.75 (SE +/- 0.06, N = 3).
Blender 4.1, Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better): DDR5-4800: 87.28 (SE +/- 0.08, N = 3); DDR5-6000: 86.62 (SE +/- 0.12, N = 3).
Blender 4.1, Blend File: Junkshop - Compute: CPU-Only (Seconds, Fewer Is Better): DDR5-4800: 12.98 (SE +/- 0.08, N = 3); DDR5-6000: 12.71 (SE +/- 0.03, N = 4).
BRL-CAD BRL-CAD is a cross-platform, open-source solid modeling system with built-in benchmark mode. Learn more via the OpenBenchmarking.org test page.
BRL-CAD 7.38.2 (VGR Performance Metric, More Is Better): DDR5-4800: 5848380; DDR5-6000: 5899055. 1. (CXX) g++ options: -std=c++17 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lnetpbm -lregex_brl -lz_brl -lassimp -ldl -lm -ltk8.6
ClickHouse ClickHouse is an open-source, high performance OLAP data management system. This test profile uses ClickHouse's standard benchmark recommendations per https://clickhouse.com/docs/en/operations/performance-test/ / https://github.com/ClickHouse/ClickBench/tree/main/clickhouse with the 100 million rows web analytics dataset. The reported value is the query processing time using the geometric mean of all separate queries performed as an aggregate. Learn more via the OpenBenchmarking.org test page.
ClickHouse 22.12.3.5, 100M Rows Hits Dataset, First Run / Cold Cache (Queries Per Minute, Geo Mean, More Is Better): DDR5-4800: 680.81 (SE +/- 4.99, N = 3; MIN: 82.42 / MAX: 6666.67); DDR5-6000: 698.12 (SE +/- 6.12, N = 3; MIN: 80.86 / MAX: 6666.67).
ClickHouse 22.12.3.5, 100M Rows Hits Dataset, Second Run (Queries Per Minute, Geo Mean, More Is Better): DDR5-4800: 693.77 (SE +/- 2.33, N = 3; MIN: 84.27 / MAX: 6666.67); DDR5-6000: 729.56 (SE +/- 5.27, N = 3; MIN: 81.63 / MAX: 7500).
ClickHouse 22.12.3.5, 100M Rows Hits Dataset, Third Run (Queries Per Minute, Geo Mean, More Is Better): DDR5-4800: 704.36 (SE +/- 2.37, N = 3; MIN: 83.45 / MAX: 6666.67); DDR5-6000: 724.64 (SE +/- 4.70, N = 3; MIN: 85.11 / MAX: 6666.67).
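The ClickHouse figures are geometric means across the individual queries. As a small illustration of why that aggregate is used when per-query rates span orders of magnitude (the reported MIN and MAX differ by roughly 80x), the sketch below compares the arithmetic and geometric means of a set of hypothetical per-query rates. The values are invented for the example and are not the per-query results of this run.

```python
import statistics

# Hypothetical queries-per-minute rates for three queries of very different cost.
per_query_qpm = [100.0, 1000.0, 10000.0]

arithmetic = statistics.mean(per_query_qpm)
geometric = statistics.geometric_mean(per_query_qpm)

# The arithmetic mean is pulled toward the fastest query, while the
# geometric mean weights each query's relative change equally.
print(f"arithmetic mean: {arithmetic:.1f}")
print(f"geometric mean:  {geometric:.1f}")
```

With a geometric mean, speeding up any single query by 10% moves the aggregate by the same factor regardless of how fast that query was to begin with.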
CloverLeaf 1.3, Input: clover_bm16 (Seconds, Fewer Is Better): DDR5-4800: 209.96 (SE +/- 0.08, N = 3); DDR5-6000: 178.93 (SE +/- 0.24, N = 3). 1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp
Embree Intel Embree is a collection of high-performance ray-tracing kernels for execution on CPUs (and GPUs via SYCL) and supporting instruction sets such as SSE, AVX, AVX2, and AVX-512. Embree also supports making use of the Intel SPMD Program Compiler (ISPC). Learn more via the OpenBenchmarking.org test page.
Embree 4.3, Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better): DDR5-4800: 220.00 (SE +/- 0.10, N = 3; MIN: 216.78 / MAX: 224.82); DDR5-6000: 223.21 (SE +/- 0.06, N = 8; MIN: 219.93 / MAX: 227.73).
Embree 4.3, Binary: Pathtracer ISPC - Model: Asian Dragon Obj (Frames Per Second, More Is Better): DDR5-4800: 189.00 (SE +/- 0.10, N = 3; MIN: 186.02 / MAX: 192.57); DDR5-6000: 191.78 (SE +/- 0.14, N = 5; MIN: 188.36 / MAX: 196.2).
Embree 4.3, Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better): DDR5-4800: 178.52 (SE +/- 0.40, N = 3; MIN: 174.12 / MAX: 184.84); DDR5-6000: 179.45 (SE +/- 0.11, N = 8; MIN: 175.01 / MAX: 186.64).
FFmpeg This is a benchmark of the FFmpeg multimedia framework. The FFmpeg test profile is making use of a modified version of vbench from Columbia University's Architecture and Design Lab (ARCADE) [http://arcade.cs.columbia.edu/vbench/] that is a benchmark for video-as-a-service workloads. The test profile offers the options of a range of vbench scenarios based on freely distributable video content and offers the options of using the x264 or x265 video encoders for transcoding. Learn more via the OpenBenchmarking.org test page.
FFmpeg 7.0, Encoder: libx265 - Scenario: Upload (FPS, More Is Better): DDR5-4800: 32.57 (SE +/- 0.02, N = 3); DDR5-6000: 32.34 (SE +/- 0.04, N = 3).
FFmpeg 7.0, Encoder: libx265 - Scenario: Platform (FPS, More Is Better): DDR5-4800: 67.14 (SE +/- 0.02, N = 3); DDR5-6000: 66.60 (SE +/- 0.06, N = 3).
FFmpeg 7.0, Encoder: libx265 - Scenario: Video On Demand (FPS, More Is Better): DDR5-4800: 67.05 (SE +/- 0.05, N = 3); DDR5-6000: 66.51 (SE +/- 0.01, N = 3).
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Google SynthMark SynthMark is a cross platform tool for benchmarking CPU performance under a variety of real-time audio workloads. It uses a polyphonic synthesizer model to provide standardized tests for latency, jitter and computational throughput. Learn more via the OpenBenchmarking.org test page.
Test: VoiceMark_100
DDR5-4800: The test run did not produce a result.
DDR5-6000: The test run did not produce a result.
GPAW GPAW is a density-functional theory (DFT) Python code based on the projector-augmented wave (PAW) method and the atomic simulation environment (ASE). Learn more via the OpenBenchmarking.org test page.
GPAW 23.6, Input: Carbon Nanotube (Seconds, Fewer Is Better): DDR5-4800: 24.30 (SE +/- 0.13, N = 3); DDR5-6000: 23.82 (SE +/- 0.10, N = 3). 1. (CC) gcc options: -shared -lxc -lblas -lmpi
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample high resolution (currently 15400 x 6940) JPEG image. Learn more via the OpenBenchmarking.org test page.
GraphicsMagick 1.3.43, Operation: Noise-Gaussian (Iterations Per Minute, More Is Better): DDR5-4800: 314 (SE +/- 0.33, N = 3); DDR5-6000: 313 (SE +/- 0.58, N = 3).
GraphicsMagick 1.3.43, Operation: Enhanced (Iterations Per Minute, More Is Better): DDR5-4800: 457 (SE +/- 0.58, N = 3); DDR5-6000: 455 (SE +/- 0.33, N = 3).
GraphicsMagick 1.3.43, Operation: Sharpen (Iterations Per Minute, More Is Better): DDR5-4800: 406 (SE +/- 0.67, N = 3); DDR5-6000: 403 (SE +/- 0.33, N = 3).
GraphicsMagick 1.3.43, Operation: Swirl (Iterations Per Minute, More Is Better): DDR5-4800: 865 (SE +/- 1.53, N = 3); DDR5-6000: 851 (SE +/- 0.33, N = 3).
1. (CC) gcc options: -fopenmp -O2 -ltiff -ljbig -lsharpyuv -lwebp -lwebpmux -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -lxml2 -lzstd -llzma -lbz2 -lz -lm -lpthread -lgomp
GROMACS This tests the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.
GROMACS 2024, Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better): DDR5-4800: 22.51 (SE +/- 0.02, N = 3); DDR5-6000: 22.73 (SE +/- 0.04, N = 3). 1. (CXX) g++ options: -O3 -lm
John The Ripper 2023.03.14, Test: bcrypt (Real C/S, More Is Better): DDR5-4800: 322406 (SE +/- 44.46, N = 3); DDR5-6000: 323000 (SE +/- 45.32, N = 3).
John The Ripper 2023.03.14, Test: WPA PSK (Real C/S, More Is Better): DDR5-4800: 1360000 (SE +/- 1527.53, N = 3); DDR5-6000: 1361333 (SE +/- 1201.85, N = 3).
1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2
Kvazaar This is a test of Kvazaar as a CPU-based H.265/HEVC video encoder written in the C programming language and optimized in Assembly. Kvazaar is the winner of the 2016 ACM Open-Source Software Competition and developed at the Ultra Video Group, Tampere University, Finland. Learn more via the OpenBenchmarking.org test page.
Kvazaar 2.2, Video Input: Bosphorus 4K - Video Preset: Slow (Frames Per Second, More Is Better): DDR5-4800: 53.43 (SE +/- 0.09, N = 3); DDR5-6000: 53.42 (SE +/- 0.06, N = 5).
Kvazaar 2.2, Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better): DDR5-4800: 53.99 (SE +/- 0.01, N = 3); DDR5-6000: 53.95 (SE +/- 0.01, N = 5).
Kvazaar 2.2, Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better): DDR5-4800: 94.71 (SE +/- 1.24, N = 3); DDR5-6000: 96.81 (SE +/- 0.49, N = 7).
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
libavif avifenc 1.0, Encoder Speed: 6, Lossless (Seconds, Fewer Is Better): DDR5-4800: 4.256 (SE +/- 0.010, N = 3); DDR5-6000: 4.221 (SE +/- 0.006, N = 8).
libavif avifenc 1.0, Encoder Speed: 10, Lossless (Seconds, Fewer Is Better): DDR5-4800: 3.452 (SE +/- 0.005, N = 3); DDR5-6000: 3.459 (SE +/- 0.003, N = 9).
1. (CXX) g++ options: -O3 -fPIC -lm
Liquid-DSP LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.
Liquid-DSP 1.6, Threads: 1 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): DDR5-4800: 44063000 (SE +/- 52003.21, N = 3); DDR5-6000: 44054000 (SE +/- 3511.88, N = 3).
Liquid-DSP 1.6, Threads: 1 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): DDR5-4800: 68321333 (SE +/- 65397.08, N = 3); DDR5-6000: 68181667 (SE +/- 98289.26, N = 3).
Liquid-DSP 1.6, Threads: 1 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): DDR5-4800: 23164333 (SE +/- 333.33, N = 3); DDR5-6000: 23093667 (SE +/- 13860.42, N = 3).
Liquid-DSP 1.6, Threads: 64 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): DDR5-4800: 2783133333 (SE +/- 352766.84, N = 3); DDR5-6000: 2784566667 (SE +/- 233333.33, N = 3).
Liquid-DSP 1.6, Threads: 64 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): DDR5-4800: 3298933333 (SE +/- 693621.73, N = 3); DDR5-6000: 3299400000 (SE +/- 5392896.56, N = 3).
Liquid-DSP 1.6, Threads: 64 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): DDR5-4800: 1446266667 (SE +/- 1386041.53, N = 3); DDR5-6000: 1442500000 (SE +/- 1960442.13, N = 3).
Liquid-DSP 1.6, Threads: 128 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): DDR5-4800: 5424133333 (SE +/- 166666.67, N = 3); DDR5-6000: 5407600000 (SE +/- 9856131.76, N = 3).
Liquid-DSP 1.6, Threads: 128 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): DDR5-4800: 5834800000 (SE +/- 19352605.34, N = 3); DDR5-6000: 5876566667 (SE +/- 20311928.62, N = 3).
Liquid-DSP 1.6, Threads: 128 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): DDR5-4800: 2484533333 (SE +/- 1474599.76, N = 3); DDR5-6000: 2487400000 (SE +/- 1305118.13, N = 3).
Liquid-DSP 1.6, Threads: 256 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): DDR5-4800: 8463100000 (SE +/- 24860209.17, N = 3); DDR5-6000: 8494733333 (SE +/- 14339494.80, N = 3).
Liquid-DSP 1.6, Threads: 256 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): DDR5-4800: 8196166667 (SE +/- 3233333.33, N = 3); DDR5-6000: 8151900000 (SE +/- 6331139.97, N = 3).
Liquid-DSP 1.6, Threads: 256 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): DDR5-4800: 2855600000 (SE +/- 1059874.21, N = 3); DDR5-6000: 2855466667 (SE +/- 3925274.23, N = 3).
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
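The Liquid-DSP runs cover 1, 64, 128 and 256 threads, so they can also be read as a thread-scaling test. A minimal sketch of that reading follows: scaling efficiency is taken as the multi-threaded throughput divided by the thread count times the single-thread throughput. The helper name is hypothetical, and the inputs are the DDR5-4800 results at Filter Length 32. The 256-thread case runs two threads per core on this 128-core part, so lower efficiency there is expected.

```python
def scaling_efficiency(single_thread: float, multi_thread: float, threads: int) -> float:
    """Fraction of ideal linear scaling achieved at the given thread count."""
    return multi_thread / (threads * single_thread)


# DDR5-4800, Buffer Length: 256, Filter Length: 32 (samples/s).
single = 44063000
results = {64: 2783133333, 128: 5424133333, 256: 8463100000}

efficiency = {n: scaling_efficiency(single, value, n) for n, value in results.items()}

for n, eff in efficiency.items():
    print(f"{n:>3} threads: {eff * 100:.1f}% of linear scaling")
```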
Llamafile Mozilla's Llamafile allows distributing and running large language models (LLMs) as a single file. Llamafile aims to make open-source LLMs more accessible to developers and users. Llamafile supports a variety of models, CPUs and GPUs, and other options. Learn more via the OpenBenchmarking.org test page.
Llamafile 0.7, Test: llava-v1.5-7b-q4 - Acceleration: CPU (Tokens Per Second, More Is Better): DDR5-4800: 29.54 (SE +/- 0.23, N = 3); DDR5-6000: 30.01 (SE +/- 0.32, N = 5).
Llamafile 0.7, Test: wizardcoder-python-34b-v1.0.Q6_K - Acceleration: CPU (Tokens Per Second, More Is Better): DDR5-4800: 9.71 (SE +/- 0.01, N = 3); DDR5-6000: 10.64 (SE +/- 0.06, N = 3).
LuxCoreRender LuxCoreRender is an open-source 3D physically based renderer formerly known as LuxRender. LuxCoreRender supports CPU-based rendering as well as GPU acceleration via OpenCL, NVIDIA CUDA, and NVIDIA OptiX interfaces. Learn more via the OpenBenchmarking.org test page.
LuxCoreRender 2.6, Scene: DLSC - Acceleration: CPU (M samples/sec, More Is Better): DDR5-4800: 21.20 (SE +/- 0.38, N = 15; MIN: 19.48 / MAX: 27.07); DDR5-6000: 20.94 (SE +/- 0.33, N = 15; MIN: 19.5 / MAX: 27).
LuxCoreRender 2.6, Scene: LuxCore Benchmark - Acceleration: CPU (M samples/sec, More Is Better): DDR5-4800: 13.93 (SE +/- 0.15, N = 15; MIN: 6.46 / MAX: 16.83); DDR5-6000: 13.96 (SE +/- 0.14, N = 15; MIN: 6.55 / MAX: 16.95).
LuxCoreRender 2.6, Scene: Orange Juice - Acceleration: CPU (M samples/sec, More Is Better): DDR5-4800: 30.83 (SE +/- 0.41, N = 15; MIN: 26.09 / MAX: 42.8); DDR5-6000: 32.02 (SE +/- 0.52, N = 15; MIN: 26.48 / MAX: 43.04).
Memcached Memcached is a high performance, distributed memory object caching system. This Memcached test profile makes use of memtier_benchmark for executing this CPU/memory-focused server benchmark. Learn more via the OpenBenchmarking.org test page.
Memcached 1.6.19, Set To Get Ratio: 1:100 (Ops/sec, More Is Better): DDR5-4800: 13455536.72 (SE +/- 14934.26, N = 3); DDR5-6000: 13316660.10 (SE +/- 136953.60, N = 3). 1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
miniBUDE MiniBUDE is a mini application for the core computation of the Bristol University Docking Engine (BUDE). This test profile currently makes use of the OpenMP implementation of miniBUDE for CPU benchmarking. Learn more via the OpenBenchmarking.org test page.
miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM1 (GFInst/s, More Is Better): DDR5-4800: 9597.53 (SE +/- 84.72, N = 15); DDR5-6000: 9647.50 (SE +/- 66.65, N = 11).
miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM1 (Billion Interactions/s, More Is Better): DDR5-4800: 383.90 (SE +/- 3.39, N = 15); DDR5-6000: 385.90 (SE +/- 2.67, N = 11).
miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM2 (GFInst/s, More Is Better): DDR5-4800: 9655.08 (SE +/- 75.53, N = 3); DDR5-6000: 9654.58 (SE +/- 76.86, N = 4).
miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM2 (Billion Interactions/s, More Is Better): DDR5-4800: 386.20 (SE +/- 3.02, N = 3); DDR5-6000: 386.18 (SE +/- 3.07, N = 4).
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm
NAMD NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. Learn more via the OpenBenchmarking.org test page.
NAMD 3.0b6, Input: ATPase with 327,506 Atoms (ns/day, More Is Better): DDR5-4800: 14.04 (SE +/- 0.02, N = 3); DDR5-6000: 14.24 (SE +/- 0.03, N = 7).
NAMD 3.0b6, Input: STMV with 1,066,628 Atoms (ns/day, More Is Better): DDR5-4800: 4.54707 (SE +/- 0.01366, N = 3); DDR5-6000: 4.62277 (SE +/- 0.00697, N = 3).
NAS Parallel Benchmarks NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB. This test profile offers selecting the different NPB tests/problems and varying problem sizes. Learn more via the OpenBenchmarking.org test page.
NAS Parallel Benchmarks 3.4, Test / Class: LU.C (Total Mop/s, More Is Better): DDR5-4800: 383603.91 (SE +/- 5209.32, N = 12); DDR5-6000: 385558.53 (SE +/- 5529.34, N = 15).
NAS Parallel Benchmarks 3.4, Test / Class: SP.C (Total Mop/s, More Is Better): DDR5-4800: 172595.77 (SE +/- 1358.70, N = 3); DDR5-6000: 194853.91 (SE +/- 878.07, N = 5).
NAS Parallel Benchmarks 3.4, Test / Class: IS.D (Total Mop/s, More Is Better): DDR5-4800: 7995.28 (SE +/- 34.68, N = 3); DDR5-6000: 8647.62 (SE +/- 55.66, N = 6).
NAS Parallel Benchmarks 3.4, Test / Class: MG.C (Total Mop/s, More Is Better): DDR5-4800: 149191.03 (SE +/- 239.36, N = 3); DDR5-6000: 167026.93 (SE +/- 1302.88, N = 15).
NAS Parallel Benchmarks 3.4, Test / Class: CG.C (Total Mop/s, More Is Better): DDR5-4800: 88719.53 (SE +/- 1060.91, N = 3); DDR5-6000: 88208.36 (SE +/- 618.06, N = 15).
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.6
NWChem NWChem is an open-source high performance computational chemistry package. Per NWChem's documentation, "NWChem aims to provide its users with computational chemistry tools that are scalable both in their ability to treat large scientific computational chemistry problems efficiently, and in their use of available parallel computing resources from high-performance parallel supercomputers to conventional workstation clusters." Learn more via the OpenBenchmarking.org test page.
NWChem 7.0.2, Input: C240 Buckyball (Seconds, Fewer Is Better): DDR5-4800: 1328.9; DDR5-6000: 1326.2. 1. (F9X) gfortran options: -lnwctask -lccsd -lmcscf -lselci -lmp2 -lmoints -lstepper -ldriver -loptim -lnwdft -lgradients -lcphf -lesp -lddscf -ldangchang -lguess -lhessian -lvib -lnwcutil -lrimp2 -lproperty -lsolvation -lnwints -lprepar -lnwmd -lnwpw -lofpw -lpaw -lpspw -lband -lnwpwlib -lcafe -lspace -lanalyze -lqhop -lpfft -ldplot -ldrdy -lvscf -lqmmm -lqmd -letrans -ltce -lbq -lmm -lcons -lperfm -ldntmc -lccca -ldimqm -lga -larmci -lpeigs -l64to32 -lopenblas -lpthread -lrt -llapack -lnwcblas -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz -lcomex -m64 -ffast-math -std=legacy -fdefault-integer-8 -finline-functions -O2
oneDNN This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.
oneDNN 3.4, Harness: Deconvolution Batch shapes_3d - Engine: CPU (ms, Fewer Is Better): DDR5-4800: 0.507850 (SE +/- 0.000838, N = 3; MIN: 0.48); DDR5-6000: 0.513417 (SE +/- 0.004859, N = 15; MIN: 0.48). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
ONNX Runtime ONNX Runtime is developed by Microsoft and partners as an open-source, cross-platform, high performance machine learning inferencing and training accelerator. This test profile runs the ONNX Runtime with various models available from the ONNX Model Zoo. Learn more via the OpenBenchmarking.org test page.
OpenFOAM OpenFOAM is the leading free, open-source software for computational fluid dynamics (CFD). This test profile currently uses the drivaerFastback test case for analyzing automotive aerodynamics or alternatively the older motorBike input. Learn more via the OpenBenchmarking.org test page.
OpenFOAM 10, Input: drivaerFastback, Small Mesh Size - Mesh Time (Seconds, Fewer Is Better): DDR5-4800: 22.90; DDR5-6000: 21.44.
OpenFOAM 10, Input: drivaerFastback, Small Mesh Size - Execution Time (Seconds, Fewer Is Better): DDR5-4800: 20.55; DDR5-6000: 20.60.
OpenFOAM 10, Input: drivaerFastback, Medium Mesh Size - Execution Time (Seconds, Fewer Is Better): DDR5-4800: 177.11; DDR5-6000: 160.37.
1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm
OpenRadioss OpenRadioss is an open-source AGPL-licensed finite element solver for dynamic event analysis. OpenRadioss is based on Altair Radioss and was open-sourced in 2022. This open-source finite element solver is benchmarked with various example models available from https://www.openradioss.org/models/ and https://github.com/OpenRadioss/ModelExchange/tree/main/Examples. This test is currently using a reference OpenRadioss binary build offered via GitHub. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better OpenRadioss 2023.09.15 Model: Cell Phone Drop Test DDR5-4800 DDR5-6000 4 8 12 16 20 SE +/- 0.12, N = 3 SE +/- 0.16, N = 3 17.74 17.87
OpenSSL OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org sign/s, More Is Better OpenSSL 3.3 Algorithm: RSA4096 DDR5-4800 DDR5-6000 15K 30K 45K 60K 75K SE +/- 80.51, N = 3 SE +/- 15.58, N = 3 69052.5 69037.7 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenBenchmarking.org verify/s, More Is Better OpenSSL 3.3 Algorithm: RSA4096 DDR5-4800 DDR5-6000 600K 1200K 1800K 2400K 3000K SE +/- 282.94, N = 3 SE +/- 331.00, N = 3 2757741.0 2755042.3 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: SHA256 DDR5-4800 DDR5-6000 40000M 80000M 120000M 160000M 200000M SE +/- 316551135.30, N = 3 SE +/- 76407988.26, N = 3 186789819953 186660393313 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: SHA512 DDR5-4800 DDR5-6000 16000M 32000M 48000M 64000M 80000M SE +/- 495890924.48, N = 3 SE +/- 163780192.60, N = 3 72936235753 72359767240 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: AES-128-GCM DDR5-4800 DDR5-6000 400000M 800000M 1200000M 1600000M 2000000M SE +/- 798384586.01, N = 3 SE +/- 679943093.73, N = 3 2003313098753 2007365351470 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: AES-256-GCM DDR5-4800 DDR5-6000 400000M 800000M 1200000M 1600000M 2000000M SE +/- 686739707.21, N = 3 SE +/- 2482404044.78, N = 3 1839699436523 1837765850747 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: ChaCha20 DDR5-4800 DDR5-6000 300000M 600000M 900000M 1200000M 1500000M SE +/- 227575784.55, N = 3 SE +/- 507390352.95, N = 3 1187315147937 1186359294450 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: ChaCha20-Poly1305 DDR5-4800 DDR5-6000 200000M 400000M 600000M 800000M 1000000M SE +/- 318743927.24, N = 3 SE +/- 89801965.81, N = 3 807588425637 807009841427 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenVINO This is a test of Intel OpenVINO, a toolkit for neural networks, using its built-in benchmarking support to analyze the throughput and latency of various models. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Face Detection FP16-INT8 - Device: CPU DDR5-4800 DDR5-6000 40 80 120 160 200 SE +/- 0.21, N = 3 SE +/- 0.32, N = 3 194.14 193.80 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Face Detection FP16-INT8 - Device: CPU DDR5-4800 DDR5-6000 70 140 210 280 350 SE +/- 0.37, N = 3 SE +/- 0.54, N = 3 328.94 329.53 MIN: 143.56 / MAX: 354.17 MIN: 165.13 / MAX: 354.63 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU DDR5-4800 DDR5-6000 40K 80K 120K 160K 200K SE +/- 417.74, N = 3 SE +/- 1255.01, N = 3 192115.85 190777.83 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU DDR5-4800 DDR5-6000 0.108 0.216 0.324 0.432 0.54 SE +/- 0.00, N = 3 SE +/- 0.01, N = 3 0.48 0.48 MIN: 0.13 / MAX: 26.21 MIN: 0.13 / MAX: 26.24 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Person Detection FP16 - Device: CPU DDR5-4800 DDR5-6000 160 320 480 640 800 SE +/- 1.38, N = 3 SE +/- 4.77, N = 15 674.46 733.80 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Person Detection FP16 - Device: CPU DDR5-4800 DDR5-6000 20 40 60 80 100 SE +/- 0.19, N = 3 SE +/- 0.54, N = 15 94.77 87.14 MIN: 40.52 / MAX: 153.33 MIN: 35.35 / MAX: 200.15 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Weld Porosity Detection FP16-INT8 - Device: CPU DDR5-4800 DDR5-6000 4K 8K 12K 16K 20K SE +/- 7.25, N = 3 SE +/- 13.14, N = 3 18679.63 18679.36 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Weld Porosity Detection FP16-INT8 - Device: CPU DDR5-4800 DDR5-6000 2 4 6 8 10 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 6.61 6.61 MIN: 2.24 / MAX: 27.98 MIN: 2.25 / MAX: 28.15 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Vehicle Detection FP16-INT8 - Device: CPU DDR5-4800 DDR5-6000 2K 4K 6K 8K 10K SE +/- 5.50, N = 3 SE +/- 2.97, N = 3 11196.67 11232.26 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Vehicle Detection FP16-INT8 - Device: CPU DDR5-4800 DDR5-6000 1.2668 2.5336 3.8004 5.0672 6.334 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 5.63 5.62 MIN: 2.39 / MAX: 30.13 MIN: 2.01 / MAX: 29.21 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Person Vehicle Bike Detection FP16 - Device: CPU DDR5-4800 DDR5-6000 2K 4K 6K 8K 10K SE +/- 7.28, N = 3 SE +/- 8.58, N = 3 8681.62 8730.88 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Person Vehicle Bike Detection FP16 - Device: CPU DDR5-4800 DDR5-6000 2 4 6 8 10 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 7.27 7.22 MIN: 4.1 / MAX: 24.92 MIN: 4.48 / MAX: 26.34 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Machine Translation EN To DE FP16 - Device: CPU DDR5-4800 DDR5-6000 200 400 600 800 1000 SE +/- 6.77, N = 3 SE +/- 2.99, N = 3 1043.31 1109.75 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Machine Translation EN To DE FP16 - Device: CPU DDR5-4800 DDR5-6000 14 28 42 56 70 SE +/- 0.40, N = 3 SE +/- 0.16, N = 3 61.24 57.57 MIN: 27.85 / MAX: 106.26 MIN: 34.99 / MAX: 106.02 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Face Detection Retail FP16-INT8 - Device: CPU DDR5-4800 DDR5-6000 7K 14K 21K 28K 35K SE +/- 17.88, N = 3 SE +/- 0.83, N = 3 31630.96 31655.57 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Face Detection Retail FP16-INT8 - Device: CPU DDR5-4800 DDR5-6000 0.873 1.746 2.619 3.492 4.365 SE +/- 0.01, N = 3 SE +/- 0.00, N = 3 3.87 3.88 MIN: 1.54 / MAX: 25.35 MIN: 1.55 / MAX: 24.23 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Handwritten English Recognition FP16-INT8 - Device: CPU DDR5-4800 DDR5-6000 1000 2000 3000 4000 5000 SE +/- 1.24, N = 3 SE +/- 5.09, N = 3 4776.66 4774.33 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Handwritten English Recognition FP16-INT8 - Device: CPU DDR5-4800 DDR5-6000 6 12 18 24 30 SE +/- 0.00, N = 3 SE +/- 0.03, N = 3 26.65 26.67 MIN: 15.32 / MAX: 48.24 MIN: 15.22 / MAX: 47.89 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Road Segmentation ADAS FP16-INT8 - Device: CPU DDR5-4800 DDR5-6000 700 1400 2100 2800 3500 SE +/- 2.32, N = 3 SE +/- 2.87, N = 3 3224.72 3287.43 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Road Segmentation ADAS FP16-INT8 - Device: CPU DDR5-4800 DDR5-6000 5 10 15 20 25 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 19.73 19.36 MIN: 9.5 / MAX: 45.5 MIN: 9.99 / MAX: 44.41 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Person Re-Identification Retail FP16 - Device: CPU DDR5-4800 DDR5-6000 3K 6K 9K 12K 15K SE +/- 2.75, N = 3 SE +/- 14.83, N = 3 13918.74 13949.89 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Person Re-Identification Retail FP16 - Device: CPU DDR5-4800 DDR5-6000 1.0103 2.0206 3.0309 4.0412 5.0515 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 4.49 4.48 MIN: 1.96 / MAX: 20.97 MIN: 1.99 / MAX: 22.31 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Noise Suppression Poconet-Like FP16 - Device: CPU DDR5-4800 DDR5-6000 2K 4K 6K 8K 10K SE +/- 39.39, N = 3 SE +/- 33.49, N = 3 9833.47 10557.16 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Noise Suppression Poconet-Like FP16 - Device: CPU DDR5-4800 DDR5-6000 3 6 9 12 15 SE +/- 0.01, N = 3 SE +/- 0.05, N = 3 11.09 10.73 MIN: 6.01 / MAX: 31.79 MIN: 6.4 / MAX: 32.23 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
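Many of the OpenVINO results above differ by less than their reported standard errors (SE), while a few differ by far more. One rough way to separate the two cases is to express the gap between the means in units of the combined standard error. The sketch below assumes the two configurations were measured independently and is a heuristic, not a formal significance test; the function name is hypothetical and the values are copied from the Person Detection FP16 and Weld Porosity Detection FP16-INT8 FPS results above.

```python
import math


def difference_in_standard_errors(mean_a: float, se_a: float,
                                  mean_b: float, se_b: float) -> float:
    """Gap between two means divided by their combined standard error.

    Assumes independent measurements, so the standard errors add in quadrature.
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_b - mean_a) / combined_se


# (DDR5-4800 mean, SE, DDR5-6000 mean, SE) in FPS, copied from the results above.
person_detection_fp16 = (674.46, 1.38, 733.80, 4.77)
weld_porosity_fp16_int8 = (18679.63, 7.25, 18679.36, 13.14)

for label, values in [("Person Detection FP16", person_detection_fp16),
                      ("Weld Porosity Detection FP16-INT8", weld_porosity_fp16_int8)]:
    ratio = difference_in_standard_errors(*values)
    print(f"{label}: gap is {ratio:.2f} combined standard errors")
```

A ratio well above 2 suggests the memory speed change is the likely cause of the gap; a ratio below 1 means the two results are indistinguishable from run-to-run noise. Results where both standard errors are reported as 0.00 would need a guard against division by zero.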
OpenVKL OpenVKL is the Intel Open Volume Kernel Library, which offers high-performance volume computation kernels and is part of the Intel oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Items / Sec, More Is Better OpenVKL 2.0.0 Benchmark: vklBenchmarkCPU ISPC DDR5-4800 DDR5-6000 800 1600 2400 3200 4000 SE +/- 1.45, N = 3 SE +/- 0.88, N = 3 3614 3654 MIN: 293 / MAX: 42496 MIN: 293 / MAX: 42376
OSPRay Intel OSPRay is a portable ray-tracing engine for high-performance, high-fidelity scientific visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Items Per Second, More Is Better OSPRay 3.1 Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time DDR5-4800 DDR5-6000 12 24 36 48 60 SE +/- 0.02, N = 3 SE +/- 0.01, N = 3 54.96 55.18
OpenBenchmarking.org Items Per Second, More Is Better OSPRay 3.1 Benchmark: particle_volume/ao/real_time DDR5-4800 DDR5-6000 12 24 36 48 60 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 54.38 54.39
OpenBenchmarking.org Items Per Second, More Is Better OSPRay 3.1 Benchmark: particle_volume/scivis/real_time DDR5-4800 DDR5-6000 12 24 36 48 60 SE +/- 0.02, N = 3 SE +/- 0.01, N = 3 54.57 54.40
OSPRay Studio Intel OSPRay Studio is an open-source, interactive visualization and ray-tracing software package. OSPRay Studio makes use of Intel OSPRay, a portable ray-tracing engine for high-performance, high-fidelity visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU DDR5-4800 DDR5-6000 140 280 420 560 700 SE +/- 0.33, N = 3 SE +/- 0.33, N = 3 642 639
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 1 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU DDR5-4800 DDR5-6000 2K 4K 6K 8K 10K SE +/- 7.81, N = 3 SE +/- 3.71, N = 3 10200 10154
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 1 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU DDR5-4800 DDR5-6000 4K 8K 12K 16K 20K SE +/- 13.58, N = 3 SE +/- 39.68, N = 3 20428 20379
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU DDR5-4800 DDR5-6000 160 320 480 640 800 SE +/- 0.67, N = 3 SE +/- 0.58, N = 3 758 753
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 3 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU DDR5-4800 DDR5-6000 3K 6K 9K 12K 15K SE +/- 11.86, N = 3 SE +/- 14.19, N = 3 12063 11982
OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 3 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU DDR5-4800 DDR5-6000 5K 10K 15K 20K 25K SE +/- 51.91, N = 3 SE +/- 32.92, N = 3 24131 23981
PostgreSQL This is a benchmark of PostgreSQL using the integrated pgbench for facilitating the database benchmarks. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org TPS, More Is Better PostgreSQL 16 Scaling Factor: 100 - Clients: 1000 - Mode: Read Write DDR5-4800 DDR5-6000 30K 60K 90K 120K 150K SE +/- 154.18, N = 3 SE +/- 68.35, N = 3 127310 126334 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
OpenBenchmarking.org ms, Fewer Is Better PostgreSQL 16 Scaling Factor: 100 - Clients: 1000 - Mode: Read Write - Average Latency DDR5-4800 DDR5-6000 2 4 6 8 10 SE +/- 0.010, N = 3 SE +/- 0.004, N = 3 7.855 7.916 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
OpenBenchmarking.org TPS, More Is Better PostgreSQL 16 Scaling Factor: 100 - Clients: 1000 - Mode: Read Only DDR5-4800 DDR5-6000 1.2M 2.4M 3.6M 4.8M 6M SE +/- 27253.44, N = 3 SE +/- 39392.13, N = 3 5364577 5418672 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
OpenBenchmarking.org ms, Fewer Is Better PostgreSQL 16 Scaling Factor: 100 - Clients: 1000 - Mode: Read Only - Average Latency DDR5-4800 DDR5-6000 0.0421 0.0842 0.1263 0.1684 0.2105 SE +/- 0.001, N = 3 SE +/- 0.001, N = 3 0.187 0.185 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
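The PostgreSQL throughput and average latency figures above are two views of the same measurement. Assuming all 1000 clients stay busy for the whole run, average latency is approximately the client count divided by transactions per second. The sketch below checks that relationship against the reported numbers; the helper name is hypothetical, and small mismatches are expected because the reported values are averages over several runs.

```python
def expected_latency_ms(clients: int, tps: float) -> float:
    """Approximate average latency in milliseconds for a closed loop of busy clients."""
    return clients / tps * 1000.0


# (mode, memory config, reported TPS, reported average latency in ms) from the results above.
reported = [
    ("Read Write", "DDR5-4800", 127310, 7.855),
    ("Read Write", "DDR5-6000", 126334, 7.916),
    ("Read Only", "DDR5-4800", 5364577, 0.187),
    ("Read Only", "DDR5-6000", 5418672, 0.185),
]

for mode, config, tps, latency_ms in reported:
    estimate = expected_latency_ms(1000, tps)
    print(f"{mode} {config}: estimated {estimate:.3f} ms, reported {latency_ms} ms")
```

Because the two metrics are tied together this way, the latency charts carry no information beyond the TPS charts for a fixed client count.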
PyBench This test profile reports the total time of the different average timed test results from PyBench. PyBench reports average test times for different functions such as BuiltinFunctionCalls and NestedForLoops, with this total result providing a rough estimate as to Python's average performance on a given system. This test profile runs PyBench each time for 20 rounds. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Milliseconds, Fewer Is Better PyBench 2018-02-16 Total For Average Test Times DDR5-4800 DDR5-6000 130 260 390 520 650 SE +/- 3.93, N = 3 SE +/- 1.70, N = 4 582 582
OpenBenchmarking.org batches/sec, More Is Better PyTorch 2.2.1 Device: CPU - Batch Size: 1 - Model: ResNet-152 DDR5-4800 DDR5-6000 5 10 15 20 25 SE +/- 0.12, N = 3 SE +/- 0.07, N = 3 20.47 20.24 MIN: 19.75 / MAX: 20.95 MIN: 19.69 / MAX: 20.76
OpenBenchmarking.org batches/sec, More Is Better PyTorch 2.2.1 Device: CPU - Batch Size: 64 - Model: ResNet-50 DDR5-4800 DDR5-6000 10 20 30 40 50 SE +/- 0.05, N = 3 SE +/- 0.27, N = 3 43.07 43.68 MIN: 41.33 / MAX: 43.96 MIN: 41.7 / MAX: 44.98
OpenBenchmarking.org batches/sec, More Is Better PyTorch 2.2.1 Device: CPU - Batch Size: 64 - Model: ResNet-152 DDR5-4800 DDR5-6000 4 8 12 16 20 SE +/- 0.16, N = 7 SE +/- 0.12, N = 3 17.77 17.46 MIN: 16.6 / MAX: 18.47 MIN: 16.79 / MAX: 17.99
OpenBenchmarking.org batches/sec, More Is Better PyTorch 2.2.1 Device: CPU - Batch Size: 256 - Model: ResNet-50 DDR5-4800 DDR5-6000 10 20 30 40 50 SE +/- 0.36, N = 3 SE +/- 0.29, N = 3 43.24 43.61 MIN: 41.12 / MAX: 44.61 MIN: 41.9 / MAX: 45.08
OpenBenchmarking.org batches/sec, More Is Better PyTorch 2.2.1 Device: CPU - Batch Size: 256 - Model: ResNet-152 DDR5-4800 DDR5-6000 4 8 12 16 20 SE +/- 0.14, N = 3 SE +/- 0.19, N = 4 17.58 17.42 MIN: 16.88 / MAX: 18.01 MIN: 16.53 / MAX: 18.16
OpenBenchmarking.org batches/sec, More Is Better PyTorch 2.2.1 Device: CPU - Batch Size: 512 - Model: ResNet-50 DDR5-4800 DDR5-6000 10 20 30 40 50 SE +/- 0.03, N = 3 SE +/- 0.13, N = 3 42.84 43.83 MIN: 41.63 / MAX: 44.31 MIN: 41.6 / MAX: 45.11
OpenBenchmarking.org batches/sec, More Is Better PyTorch 2.2.1 Device: CPU - Batch Size: 512 - Model: ResNet-152 DDR5-4800 DDR5-6000 4 8 12 16 20 SE +/- 0.13, N = 3 SE +/- 0.05, N = 3 17.53 17.49 MIN: 16.64 / MAX: 17.95 MIN: 16.94 / MAX: 18.01
QMCPACK QMCPACK is a modern high-performance open-source Quantum Monte Carlo (QMC) simulation code making use of MPI for this benchmark of the H2O example code. QMCPACK is an open-source production-level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids. QMCPACK is supported by the U.S. Department of Energy. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Total Execution Time - Seconds, Fewer Is Better QMCPACK 3.17.1 Input: Li2_STO_ae DDR5-4800 DDR5-6000 20 40 60 80 100 SE +/- 0.54, N = 3 SE +/- 0.27, N = 3 78.80 79.02 1. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl
QuantLib QuantLib is an open-source library/framework around quantitative finance for modeling, trading and risk management scenarios. QuantLib is written in C++ with Boost, and its built-in benchmark reports the QuantLib Benchmark Index score. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org MFLOPS, More Is Better QuantLib 1.32 Configuration: Single-Threaded DDR5-4800 DDR5-6000 900 1800 2700 3600 4500 SE +/- 45.37, N = 3 SE +/- 40.37, N = 3 4252.2 4264.7 1. (CXX) g++ options: -O3 -march=native -fPIE -pie
OpenBenchmarking.org MFLOPS, More Is Better QuantLib 1.32 Configuration: Multi-Threaded DDR5-4800 DDR5-6000 100K 200K 300K 400K 500K SE +/- 649.08, N = 3 SE +/- 401.79, N = 3 464193.2 465436.8 1. (CXX) g++ options: -O3 -march=native -fPIE -pie
RocksDB This is a benchmark of Meta/Facebook's RocksDB as an embeddable persistent key-value store for fast storage based on Google's LevelDB. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Op/s, More Is Better RocksDB 9.0 Test: Random Read DDR5-4800 DDR5-6000 200M 400M 600M 800M 1000M SE +/- 225348.54, N = 3 SE +/- 5260460.78, N = 3 793390385 786896580 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
OpenBenchmarking.org Op/s, More Is Better RocksDB 9.0 Test: Read While Writing DDR5-4800 DDR5-6000 4M 8M 12M 16M 20M SE +/- 233412.71, N = 15 SE +/- 241581.30, N = 15 16817111 17063021 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
SecureMark SecureMark is an objective, standardized benchmarking framework for measuring the efficiency of cryptographic processing solutions developed by EEMBC. SecureMark-TLS is benchmarking Transport Layer Security performance with a focus on IoT/edge computing. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org marks, More Is Better SecureMark 1.0.4 Benchmark: SecureMark-TLS DDR5-4800 DDR5-6000 90K 180K 270K 360K 450K SE +/- 2301.56, N = 3 SE +/- 809.78, N = 3 407759 405454 1. (CC) gcc options: -pedantic -O3
Speedb Speedb is a next-generation key-value storage engine that is RocksDB-compatible and aims for stability, efficiency, and performance. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Op/s, More Is Better Speedb 2.7 Test: Random Read DDR5-4800 DDR5-6000 200M 400M 600M 800M 1000M SE +/- 4175835.13, N = 3 SE +/- 707549.54, N = 3 818556768 822102828 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
OpenBenchmarking.org Op/s, More Is Better Speedb 2.7 Test: Read While Writing DDR5-4800 DDR5-6000 4M 8M 12M 16M 20M SE +/- 78558.58, N = 3 SE +/- 433421.14, N = 12 18826431 18687897 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
srsRAN Project srsRAN Project is a complete ORAN-native 5G RAN solution created by Software Radio Systems (SRS). The srsRAN Project radio suite was formerly known as srsLTE and can be used for building your own software-defined radio (SDR) 4G/5G mobile network. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.10.1-20240325 Test: PUSCH Processor Benchmark, Throughput Total DDR5-4800 DDR5-6000 2K 4K 6K 8K 10K SE +/- 89.89, N = 15 SE +/- 0.32, N = 3 8056.3 7954.2 MIN: 5096.4 / MAX: 8588.6 MIN: 5432.6 / MAX: 7954.6 1. (CXX) g++ options: -O3 -march=native -mavx2 -mavx -msse4.1 -mfma -mavx512f -mavx512cd -mavx512bw -mavx512dq -fno-trapping-math -fno-math-errno -ldl
OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.10.1-20240325 Test: PUSCH Processor Benchmark, Throughput Thread DDR5-4800 DDR5-6000 40 80 120 160 200 SE +/- 0.00, N = 3 SE +/- 0.00, N = 4 183.3 183.3 MIN: 105.8 MIN: 105.8 1. (CXX) g++ options: -O3 -march=native -mavx2 -mavx -msse4.1 -mfma -mavx512f -mavx512cd -mavx512bw -mavx512dq -fno-trapping-math -fno-math-errno -ldl
OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.10.1-20240325 Test: PDSCH Processor Benchmark, Throughput Total DDR5-4800 DDR5-6000 4K 8K 12K 16K 20K SE +/- 180.03, N = 3 SE +/- 214.06, N = 3 19592.6 19454.1 1. (CXX) g++ options: -O3 -march=native -mavx2 -mavx -msse4.1 -mfma -mavx512f -mavx512cd -mavx512bw -mavx512dq -fno-trapping-math -fno-math-errno -ldl
OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.10.1-20240325 Test: PDSCH Processor Benchmark, Throughput Thread DDR5-4800 DDR5-6000 200 400 600 800 1000 SE +/- 10.21, N = 5 SE +/- 2.19, N = 9 990.4 994.0 1. (CXX) g++ options: -O3 -march=native -mavx2 -mavx -msse4.1 -mfma -mavx512f -mavx512cd -mavx512bw -mavx512dq -fno-trapping-math -fno-math-errno -ldl
Stockfish This is a test of Stockfish, an advanced open-source C++11 chess benchmark that can scale up to 1024 CPU threads. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Nodes Per Second, More Is Better Stockfish 16.1 Chess Benchmark DDR5-4800 DDR5-6000 70M 140M 210M 280M 350M SE +/- 1906957.54, N = 3 SE +/- 4338134.07, N = 15 308692581 307343923 1. (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -funroll-loops -msse -msse3 -mpopcnt -mavx2 -mbmi -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto-partition=one -flto=jobserver
SVT-AV1 This is a benchmark of the SVT-AV1 open-source video encoder/decoder. SVT-AV1 was originally developed by Intel as part of their Open Visual Cloud / Scalable Video Technology (SVT). Development of SVT-AV1 has since moved to the Alliance for Open Media as part of upstream AV1 development. SVT-AV1 is a CPU-based multi-threaded video encoder for the AV1 video format with a sample YUV video file. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 13 - Input: Bosphorus 4K DDR5-4800 DDR5-6000 60 120 180 240 300 SE +/- 9.35, N = 12 SE +/- 7.58, N = 15 265.85 271.92 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 12 - Input: Bosphorus 4K DDR5-4800 DDR5-6000 60 120 180 240 300 SE +/- 2.96, N = 5 SE +/- 2.17, N = 9 274.07 281.59 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 8 - Input: Bosphorus 4K DDR5-4800 DDR5-6000 30 60 90 120 150 SE +/- 1.05, N = 15 SE +/- 1.10, N = 7 117.85 118.16 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 4 - Input: Bosphorus 4K DDR5-4800 DDR5-6000 3 6 9 12 15 SE +/- 0.05, N = 3 SE +/- 0.04, N = 4 11.26 11.28 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
TensorFlow This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 1 - Model: ResNet-50 DDR5-4800 DDR5-6000 2 4 6 8 10 SE +/- 0.10, N = 15 SE +/- 0.01, N = 3 6.96 7.11
OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 64 - Model: ResNet-50 DDR5-4800 DDR5-6000 30 60 90 120 150 SE +/- 1.56, N = 4 SE +/- 0.26, N = 3 134.50 137.62
OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 64 - Model: AlexNet DDR5-4800 DDR5-6000 200 400 600 800 1000 SE +/- 11.67, N = 3 SE +/- 3.83, N = 6 1071.51 1086.48
OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 64 - Model: GoogLeNet DDR5-4800 DDR5-6000 80 160 240 320 400 SE +/- 3.12, N = 8 SE +/- 2.87, N = 3 355.14 361.38
OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 256 - Model: ResNet-50 DDR5-4800 DDR5-6000 40 80 120 160 200 SE +/- 0.77, N = 3 SE +/- 0.65, N = 3 195.89 203.33
OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 256 - Model: AlexNet DDR5-4800 DDR5-6000 500 1000 1500 2000 2500 SE +/- 16.20, N = 3 SE +/- 2.78, N = 4 2223.51 2271.95
OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 256 - Model: GoogLeNet DDR5-4800 DDR5-6000 140 280 420 560 700 SE +/- 6.10, N = 3 SE +/- 7.40, N = 3 641.12 659.11
OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 512 - Model: ResNet-50 DDR5-4800 DDR5-6000 50 100 150 200 250 SE +/- 0.47, N = 3 SE +/- 0.52, N = 3 231.71 246.24
OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 512 - Model: AlexNet DDR5-4800 DDR5-6000 600 1200 1800 2400 3000 SE +/- 3.62, N = 3 SE +/- 2.64, N = 3 2644.59 2665.92
OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 512 - Model: GoogLeNet DDR5-4800 DDR5-6000 200 400 600 800 1000 SE +/- 3.88, N = 3 SE +/- 3.19, N = 3 780.73 819.05
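For "More Is Better" throughput results such as the TensorFlow numbers above, the gain is the ratio of the two throughputs minus one. The sketch below applies that to the four ResNet-50 results to show how the DDR5-6000 advantage changes with batch size; the helper name percent_gain is hypothetical and the values are copied from the results above.

```python
def percent_gain(baseline: float, candidate: float) -> float:
    """Percent throughput increase of the candidate over the baseline."""
    return (candidate / baseline - 1.0) * 100.0


# batch size -> (DDR5-4800 images/sec, DDR5-6000 images/sec) for ResNet-50, from the results above.
resnet50_images_per_sec = {
    1: (6.96, 7.11),
    64: (134.50, 137.62),
    256: (195.89, 203.33),
    512: (231.71, 246.24),
}

gains = {batch: percent_gain(slow, fast)
         for batch, (slow, fast) in resnet50_images_per_sec.items()}

for batch, gain in gains.items():
    print(f"ResNet-50, batch size {batch}: {gain:+.2f}%")
```

In this data the gain grows from about 2% at batch size 1 to a little over 6% at batch size 512. Note that the batch size 1 result for DDR5-4800 carries a comparatively large standard error (0.10 over 15 runs), so the smallest gain is also the least certain.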
OpenBenchmarking.org Frames Per Second, More Is Better uvg266 0.4.1 Video Input: Bosphorus 4K - Video Preset: Medium DDR5-4800 DDR5-6000 10 20 30 40 50 SE +/- 0.05, N = 3 SE +/- 0.04, N = 4 43.96 43.85
OpenBenchmarking.org Frames Per Second, More Is Better uvg266 0.4.1 Video Input: Bosphorus 4K - Video Preset: Very Fast DDR5-4800 DDR5-6000 20 40 60 80 100 SE +/- 0.27, N = 3 SE +/- 0.14, N = 6 76.35 78.23
VVenC VVenC is the Fraunhofer Versatile Video Encoder, a fast/efficient H.266/VVC encoder. The vvenc encoder makes use of SIMD Everywhere (SIMDe). The vvenc software is published under the Clear BSD License. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.11 Video Input: Bosphorus 4K - Video Preset: Fast DDR5-4800 DDR5-6000 3 6 9 12 15 SE +/- 0.040, N = 3 SE +/- 0.028, N = 3 9.719 9.810 1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.11 Video Input: Bosphorus 4K - Video Preset: Faster DDR5-4800 DDR5-6000 6 12 18 24 30 SE +/- 0.04, N = 3 SE +/- 0.02, N = 3 22.88 23.74 1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
OpenBenchmarking.org MP/s, More Is Better WebP Image Encode 1.2.4 Encode Settings: Quality 100, Lossless, Highest Compression DDR5-4800 DDR5-6000 0.153 0.306 0.459 0.612 0.765 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 0.67 0.68 1. (CC) gcc options: -fvisibility=hidden -O2 -lm
WRF WRF, the Weather Research and Forecasting Model, is a "next-generation mesoscale numerical weather prediction system designed for both atmospheric research and operational forecasting applications. It features two dynamical cores, a data assimilation system, and a software architecture supporting parallel computation and system extensibility." Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better WRF 4.2.2 Input: conus 2.5km DDR5-4800 DDR5-6000 1300 2600 3900 5200 6500 6206.69 5574.36 1. (F9X) gfortran options: -O2 -ftree-vectorize -funroll-loops -ffree-form -fconvert=big-endian -frecord-marker=4 -fallow-invalid-boz -lesmf_time -lwrfio_nf -lnetcdff -lnetcdf -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
x265 This is a simple test of the x265 encoder run on the CPU with 1080p and 4K options for H.265 video encode performance with x265. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Frames Per Second, More Is Better x265 3.6 Video Input: Bosphorus 4K DDR5-4800 DDR5-6000 8 16 24 32 40 SE +/- 0.08, N = 3 SE +/- 0.11, N = 3 34.08 34.21 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Xcompact3d Incompact3d Xcompact3d Incompact3d is a Fortran-MPI based, finite-difference high-performance code for solving the incompressible Navier-Stokes equations along with as many scalar transport equations as needed. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better Xcompact3d Incompact3d 2021-03-11 Input: input.i3d 193 Cells Per Direction DDR5-4800 DDR5-6000 2 4 6 8 10 SE +/- 0.01024398, N = 3 SE +/- 0.01274608, N = 6 6.79525073 6.47804538 1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Xcompact3d Incompact3d 2021-03-11 - Input: X3D-benchmarking input.i3d
Seconds, Fewer Is Better
DDR5-4800: 222.62 (SE +/- 0.04, N = 3)
DDR5-6000: 199.84 (SE +/- 0.78, N = 3)
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Xmrig
Xmrig is an open-source cross-platform CPU/GPU miner for RandomX, KawPow, CryptoNight and AstroBWT. This test profile is set up to measure the Xmrig CPU mining performance. Learn more via the OpenBenchmarking.org test page.
Xmrig 6.21 - Variant: GhostRider - Hash Count: 1M
H/s, More Is Better
DDR5-4800: 18386.4 (SE +/- 997.93, N = 15)
DDR5-6000: 19873.0 (SE +/- 912.79, N = 15)
1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
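The Xmrig GhostRider result above reports unusually large standard errors (N = 15 runs per configuration), so it may be worth checking how the gap between the two memory speeds compares with the run-to-run noise. The following is a minimal sketch, not part of the original test data: it assumes the two sets of runs are independent and combines the two reported SE values in quadrature. The function name and the "about two combined standard errors" rule of thumb are illustrative assumptions.

```python
import math


def diff_in_standard_errors(mean_a, se_a, mean_b, se_b):
    """Return |mean_b - mean_a| in units of the combined standard error.

    Assumes independent runs, so the standard error of the difference
    is sqrt(se_a**2 + se_b**2).
    """
    combined_se = math.hypot(se_a, se_b)
    return abs(mean_b - mean_a) / combined_se


# Values transcribed from the Xmrig GhostRider result (H/s).
DDR5_4800 = (18386.4, 997.93)
DDR5_6000 = (19873.0, 912.79)

z = diff_in_standard_errors(*DDR5_4800, *DDR5_6000)
print(f"Difference is {z:.2f} combined standard errors")

# Rule of thumb (an assumption, not from the source): treat gaps below
# roughly 2 combined standard errors as hard to separate from noise.
print("clearly separated" if z >= 2.0 else "within run-to-run noise")
```

Under these assumptions the Xmrig gap comes out at only a little over one combined standard error, so this particular result is better read as inconclusive than as a measured gain from DDR5-6000.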
Zstd Compression
This test measures the time needed to compress/decompress a sample file (silesia.tar) using Zstd (Zstandard) compression with options for different compression levels / settings. Learn more via the OpenBenchmarking.org test page.
Zstd Compression 1.5.4 - Compression Level: 19 - Compression Speed
MB/s, More Is Better
DDR5-4800: 21.4 (SE +/- 0.06, N = 3)
DDR5-6000: 21.8 (SE +/- 0.12, N = 3)
1. (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression 1.5.4 - Compression Level: 19 - Decompression Speed
MB/s, More Is Better
DDR5-4800: 1721.5 (SE +/- 6.51, N = 3)
DDR5-6000: 1726.4 (SE +/- 8.42, N = 3)
1. (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Compression Speed
MB/s, More Is Better
DDR5-4800: 10.2 (SE +/- 0.06, N = 3)
DDR5-6000: 10.2 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Decompression Speed
MB/s, More Is Better
DDR5-4800: 1627.7 (SE +/- 4.80, N = 3)
DDR5-6000: 1636.4 (SE +/- 1.50, N = 3)
1. (CC) gcc options: -O3 -pthread -lz -llzma
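Because some of the results above are "more is better" and others "fewer is better", the DDR5-6000 advantage is easier to compare once every result is expressed as a speedup percentage. The sketch below is illustrative rather than part of the original result file: the figures are transcribed from the results above, while the dictionary layout and the `speedup_percent` helper are hypothetical names chosen for this example.

```python
# (DDR5-4800 result, DDR5-6000 result, higher_is_better)
RESULTS = {
    "WebP Q100 Lossless Highest (MP/s)": (0.67, 0.68, True),
    "WRF conus 2.5km (s)": (6206.69, 5574.36, False),
    "x265 Bosphorus 4K (FPS)": (34.08, 34.21, True),
    "Xcompact3d 193 cells (s)": (6.79525073, 6.47804538, False),
    "Xcompact3d X3D-benchmarking (s)": (222.62, 199.84, False),
    "Xmrig GhostRider 1M (H/s)": (18386.4, 19873.0, True),
    "Zstd 19 compress (MB/s)": (21.4, 21.8, True),
    "Zstd 19 decompress (MB/s)": (1721.5, 1726.4, True),
    "Zstd 19 long compress (MB/s)": (10.2, 10.2, True),
    "Zstd 19 long decompress (MB/s)": (1627.7, 1636.4, True),
}


def speedup_percent(ddr5_4800, ddr5_6000, higher_is_better):
    """Percent by which DDR5-6000 is faster than DDR5-4800.

    For time-based results (fewer is better) the ratio is inverted so
    that a positive number always means DDR5-6000 is ahead.
    """
    if higher_is_better:
        ratio = ddr5_6000 / ddr5_4800
    else:
        ratio = ddr5_4800 / ddr5_6000
    return (ratio - 1.0) * 100.0


# Print the results sorted from largest to smallest DDR5-6000 advantage.
for name, values in sorted(
    RESULTS.items(), key=lambda item: speedup_percent(*item[1]), reverse=True
):
    print(f"{name:36s} {speedup_percent(*values):+6.2f}%")
```

For reference, the transfer rate itself rises by 25% going from 4800 MT/s to 6000 MT/s, so that is a rough ceiling for a purely bandwidth-bound workload. In this set, WRF and the larger Xcompact3d input come closest to it, while x265, WebP and Zstd barely move.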
DDR5-4800 Processor: AMD EPYC 9755 128-Core @ 2.70GHz (128 Cores / 256 Threads), Motherboard: AMD VOLCANO (RVOT1000D BIOS), Chipset: AMD Device 153a, Memory: 12 x 64GB DDR5-4800MT/s Samsung M321R8GA0PB1-CCPKC, Disk: 2 x 1920GB KIOXIA KCD8XPUG1T92, Graphics: ASPEED, Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 24.04, Kernel: 6.10.0-phx (x86_64), Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1920x1200
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-OiuXZC/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-OiuXZC/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xb002110
Java Notes: OpenJDK Runtime Environment (build 21.0.3-ea+7-Ubuntu-1build1)
Python Notes: Python 3.12.2
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 28 September 2024 17:29 by user phoronix.
DDR5-6000 Processor: AMD EPYC 9755 128-Core @ 2.70GHz (128 Cores / 256 Threads), Motherboard: AMD VOLCANO (RVOT1000D BIOS), Chipset: AMD Device 153a, Memory: 12 x 64GB DDR5-6000MT/s Samsung M321R8GA0PB1-CCPKC, Disk: 2 x 1920GB KIOXIA KCD8XPUG1T92, Graphics: ASPEED, Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 24.04, Kernel: 6.10.0-phx (x86_64), Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1920x1200
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-OiuXZC/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-OiuXZC/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xb002110
Java Notes: OpenJDK Runtime Environment (build 21.0.3-ea+7-Ubuntu-1build1)
Python Notes: Python 3.12.2
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 29 September 2024 15:49 by user phoronix.