2 x Intel Xeon Max 9480 benchmarks for a future article.
Linux 6.8
  Processor: 2 x Intel Xeon Max 9480 @ 3.50GHz (112 Cores / 224 Threads)
  Motherboard: Supermicro SYS-221H-TNR X13DEM v1.10 (1.3 BIOS)
  Chipset: Intel Device 1bce
  Memory: 512GB
  Disk: 7682GB INTEL SSDPF2KX076TZ
  Graphics: ASPEED
  Network: 2 x Broadcom BCM57508 NetXtreme-E 10Gb/25Gb/40Gb/50Gb/100Gb/200Gb
  OS: Ubuntu 23.10
  Kernel: 6.8.0-060800-generic (x86_64)
  Desktop: GNOME Shell 45.0
  Display Server: X Server 1.21.1.7
  Compiler: GCC 13.2.0
  File-System: ext4
  Screen Resolution: 1024x768
  Kernel Notes: Transparent Huge Pages: madvise
  Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
  Processor Notes: Scaling Governor: intel_cpufreq performance - CPU Microcode: 0x2c000290
  Python Notes: Python 3.11.6
  Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
Linux 6.9-rc2
  OS: Ubuntu 23.10
  Kernel: 6.9.0-060900rc2-generic (x86_64)
  Desktop: GNOME Shell 45.0
  Display Server: X Server 1.21.1.7
  Compiler: GCC 13.2.0
  File-System: ext4
  Screen Resolution: 1024x768
  Kernel Notes: Transparent Huge Pages: madvise
  Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
  Processor Notes: Scaling Governor: intel_cpufreq performance - CPU Microcode: 0x2c000290
  Python Notes: Python 3.11.6
  Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
MariaDB mariadb-slap
OpenBenchmarking.org Queries Per Second, More Is Better - MariaDB mariadb-slap 11.5, Clients: 1024
  Linux 6.9-rc2: 36 (SE +/- 1.41, N = 6)
  Linux 6.8: 34 (SE +/- 1.23, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O3 -shared -lrt -lpthread -lz -ldl -lm -lstdc++

Queries Per Second, More Is Better - MariaDB mariadb-slap 11.5, Clients: 512
  Linux 6.9-rc2: 86 (SE +/- 0.87, N = 9)
  Linux 6.8: 86 (SE +/- 0.41, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O3 -shared -lrt -lpthread -lz -ldl -lm -lstdc++
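Each result in this export reports a mean over N runs plus a standard error ("SE +/-"). As a minimal sketch of how such an SE figure is derived (the per-run samples below are hypothetical; the actual per-run values are not published here), the standard error of the mean is the sample standard deviation divided by the square root of N:

```python
import statistics

def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    n = len(samples)
    return statistics.stdev(samples) / n ** 0.5

# Hypothetical per-run values for a 6-run result.
runs = [34.0, 36.0, 35.0, 33.0, 36.0, 34.0]
print(round(statistics.mean(runs), 2))  # the mean is what the bar value reports
print(round(standard_error(runs), 2))   # the "SE +/-" figure
```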
Quicksilver Quicksilver is a proxy application that represents some elements of the Mercury workload by solving a simplified dynamic Monte Carlo particle transport problem. Quicksilver is developed by Lawrence Livermore National Laboratory (LLNL) and this test profile currently makes use of the OpenMP CPU threaded code path. Learn more via the OpenBenchmarking.org test page.
Figure Of Merit, More Is Better - Quicksilver 20230818, Input: CORAL2 P2
  Linux 6.9-rc2: 6689667 (SE +/- 91964.82, N = 9)
  Linux 6.8: 6677333 (SE +/- 74194.56, N = 9)
  1. (CXX) g++ options: -fopenmp -O3 -march=native

Figure Of Merit, More Is Better - Quicksilver 20230818, Input: CTS2
  Linux 6.8: 8425444 (SE +/- 97067.86, N = 9)
  Linux 6.9-rc2: 8129143 (SE +/- 244177.28, N = 7)
  1. (CXX) g++ options: -fopenmp -O3 -march=native

Stockfish
Nodes Per Second, More Is Better - Stockfish 16.1, Chess Benchmark
  Linux 6.9-rc2: 64443099 (SE +/- 2705883.43, N = 9)
  Linux 6.8: 61999717 (SE +/- 2993041.97, N = 9)
  1. (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -funroll-loops -msse -msse3 -mpopcnt -mavx2 -mbmi -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto-partition=one -flto=jobserver
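A quick way to read these deltas: the Stockfish means work out to roughly a 3.9% advantage for Linux 6.9-rc2 over 6.8. A minimal sketch of that arithmetic, using the two reported means:

```python
def percent_change(new, old):
    """Relative change of `new` versus `old`, in percent."""
    return (new - old) / old * 100.0

# Means from the Stockfish result above (nodes per second).
linux_69_rc2 = 64443099
linux_68 = 61999717
print(round(percent_change(linux_69_rc2, linux_68), 2))  # -> 3.94
```

Note that with reported standard errors of roughly 2.7M and 3.0M nodes per second, a gap of this size is within run-to-run noise.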
Llama.cpp Llama.cpp is a port of Facebook's LLaMA model in C/C++ developed by Georgi Gerganov. Llama.cpp allows the inference of LLaMA and other supported models in C/C++. For CPU inference Llama.cpp supports AVX2/AVX-512, ARM NEON, and other modern ISAs along with features like OpenBLAS usage. Learn more via the OpenBenchmarking.org test page.
Tokens Per Second, More Is Better - Llama.cpp b1808, Model: llama-2-70b-chat.Q5_0.gguf
  Linux 6.9-rc2: 0.37 (SE +/- 0.00, N = 3)
  Linux 6.8: 0.36 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas
FFmpeg This is a benchmark of the FFmpeg multimedia framework. The FFmpeg test profile makes use of a modified version of vbench from Columbia University's Architecture and Design Lab (ARCADE) [http://arcade.cs.columbia.edu/vbench/], a benchmark for video-as-a-service workloads. The test profile offers a range of vbench scenarios based on freely distributable video content and the option of using the x264 or x265 video encoders for transcoding. Learn more via the OpenBenchmarking.org test page.
FPS, More Is Better - FFmpeg 6.1, Encoder: libx265, Scenario: Platform
  Linux 6.8: 42.74 (SE +/- 0.71, N = 12)
  Linux 6.9-rc2: 40.20 (SE +/- 0.46, N = 12)
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

Blender
Seconds, Fewer Is Better - Blender 4.1, Blend File: Barbershop, Compute: CPU-Only
  Linux 6.8: 271.99 (SE +/- 6.20, N = 8)
  Linux 6.9-rc2: 279.74 (SE +/- 6.67, N = 9)
Speedb Speedb is a next-generation key value storage engine that is RocksDB compatible and aiming for stability, efficiency, and performance. Learn more via the OpenBenchmarking.org test page.
Op/s, More Is Better - Speedb 2.7, Test: Sequential Fill
  Linux 6.9-rc2: 330167 (SE +/- 1955.03, N = 3)
  Linux 6.8: 327637 (SE +/- 1183.38, N = 3)
  1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
FFmpeg
FPS, More Is Better - FFmpeg 6.1, Encoder: libx265, Scenario: Video On Demand
  Linux 6.8: 44.80 (SE +/- 0.50, N = 12)
  Linux 6.9-rc2: 44.24 (SE +/- 0.76, N = 9)
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

BRL-CAD
VGR Performance Metric, More Is Better - BRL-CAD 7.38.2, VGR Performance Metric
  Linux 6.9-rc2: 3991802
  Linux 6.8: 3906839
  1. (CXX) g++ options: -std=c++17 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lnetpbm -lregex_brl -lz_brl -lassimp -ldl -lm -ltk8.6
Llama.cpp
Tokens Per Second, More Is Better - Llama.cpp b1808, Model: llama-2-7b.Q4_0.gguf
  Linux 6.9-rc2: 2.49 (SE +/- 0.42, N = 12)
  Linux 6.8: 1.99 (SE +/- 0.06, N = 6)
  1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas

OSPRay
Items Per Second, More Is Better - OSPRay 3.1, Benchmark: particle_volume/pathtracer/real_time
  Linux 6.8: 102.95 (SE +/- 1.85, N = 12)
  Linux 6.9-rc2: 98.87 (SE +/- 1.98, N = 9)

TensorFlow
images/sec, More Is Better - TensorFlow 2.16.1, Device: CPU, Batch Size: 32, Model: ResNet-50
  Linux 6.8: 35.30 (SE +/- 0.48, N = 15)
  Linux 6.9-rc2: 34.86 (SE +/- 0.42, N = 15)

VVenC
Frames Per Second, More Is Better - VVenC 1.11, Video Input: Bosphorus 4K, Video Preset: Fast
  Linux 6.9-rc2: 4.936 (SE +/- 0.077, N = 12)
  Linux 6.8: 4.902 (SE +/- 0.075, N = 10)
  1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
Llamafile Mozilla's Llamafile allows distributing and running large language models (LLMs) as a single file. Llamafile aims to make open-source LLMs more accessible to developers and users. Llamafile supports a variety of models, CPUs and GPUs, and other options. Learn more via the OpenBenchmarking.org test page.
Tokens Per Second, More Is Better - Llamafile 0.6, Test: llava-v1.5-7b-q4, Acceleration: CPU
  Linux 6.8: 1.84 (SE +/- 0.07, N = 12)
  Linux 6.9-rc2: 1.56 (SE +/- 0.05, N = 15)
OpenSSL OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. The system/openssl test profile relies on benchmarking the system/OS-supplied openssl binary rather than the pts/openssl test profile that uses the locally-built OpenSSL for benchmarking. Learn more via the OpenBenchmarking.org test page.
byte/s, More Is Better - OpenSSL, Algorithm: SHA256
  Linux 6.8: 84678420667 (SE +/- 767626108.54, N = 3)
  Linux 6.9-rc2: 81678284176 (SE +/- 717052715.91, N = 12)
  1. OpenSSL 3.0.10 1 Aug 2023 (Library: OpenSSL 3.0.10 1 Aug 2023)

TensorFlow
images/sec, More Is Better - TensorFlow 2.16.1, Device: CPU, Batch Size: 256, Model: ResNet-50
  Linux 6.9-rc2: 70.53 (SE +/- 0.47, N = 3)
  Linux 6.8: 70.10 (SE +/- 0.38, N = 3)

OSPRay Studio
ms, Fewer Is Better - OSPRay Studio 1.0, Camera: 2, Resolution: 1080p, Samples Per Pixel: 32, Renderer: Path Tracer, Acceleration: CPU
  Linux 6.8: 14836 (SE +/- 268.32, N = 12)
  Linux 6.9-rc2: 15415 (SE +/- 235.77, N = 15)

TensorFlow
images/sec, More Is Better - TensorFlow 2.16.1, Device: CPU, Batch Size: 64, Model: ResNet-50
  Linux 6.9-rc2: 47.96 (SE +/- 0.42, N = 3)
  Linux 6.8: 47.03 (SE +/- 0.54, N = 12)

images/sec, More Is Better - TensorFlow 2.16.1, Device: CPU, Batch Size: 256, Model: GoogLeNet
  Linux 6.9-rc2: 203.63 (SE +/- 1.95, N = 3)
  Linux 6.8: 202.37 (SE +/- 2.20, N = 12)
WebP2 Image Encode This is a test of Google's libwebp2 library with the WebP2 image encode utility, using a sample 6000x4000 pixel JPEG image as the input, similar to the WebP/libwebp test profile. WebP2 is currently experimental and under heavy development as ultimately the successor to WebP. WebP2 supports 10-bit HDR, more efficient lossy compression, improved lossless compression, animation support, and full multi-threading support compared to WebP. Learn more via the OpenBenchmarking.org test page.
MP/s, More Is Better - WebP2 Image Encode 20220823, Encode Settings: Quality 100, Lossless Compression
  Linux 6.9-rc2: 0.07 (SE +/- 0.00, N = 3)
  Linux 6.8: 0.07 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -msse4.2 -fno-rtti -O3 -ldl
Llama.cpp
Tokens Per Second, More Is Better - Llama.cpp b1808, Model: llama-2-13b.Q4_0.gguf
  Linux 6.9-rc2: 1.53 (SE +/- 0.02, N = 3)
  Linux 6.8: 1.48 (SE +/- 0.01, N = 3)
  1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas

OSPRay Studio
ms, Fewer Is Better - OSPRay Studio 1.0, Camera: 1, Resolution: 4K, Samples Per Pixel: 1, Renderer: Path Tracer, Acceleration: CPU
  Linux 6.9-rc2: 1774 (SE +/- 22.91, N = 15)
  Linux 6.8: 1801 (SE +/- 40.26, N = 15)
FFmpeg
FPS, More Is Better - FFmpeg 6.1, Encoder: libx265, Scenario: Live
  Linux 6.9-rc2: 113.99 (SE +/- 1.80, N = 15)
  Linux 6.8: 112.43 (SE +/- 1.36, N = 15)
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

TensorFlow
images/sec, More Is Better - TensorFlow 2.16.1, Device: CPU, Batch Size: 1, Model: ResNet-50
  Linux 6.9-rc2: 1.90 (SE +/- 0.03, N = 15)
  Linux 6.8: 1.88 (SE +/- 0.03, N = 15)
Llamafile
Tokens Per Second, More Is Better - Llamafile 0.6, Test: wizardcoder-python-34b-v1.0.Q6_K, Acceleration: CPU
  Linux 6.8: 0.79 (SE +/- 0.01, N = 4)
  Linux 6.9-rc2: 0.77 (SE +/- 0.00, N = 3)

oneDNN
ms, Fewer Is Better - oneDNN 3.4, Harness: Recurrent Neural Network Training, Engine: CPU
  Linux 6.8: 4821.32 (SE +/- 22.69, N = 3, MIN: 4659.99)
  Linux 6.9-rc2: 10195.75 (SE +/- 498.89, N = 13, MIN: 4568.3)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenVINO
ms, Fewer Is Better - OpenVINO 2024.0, Model: Age Gender Recognition Retail 0013 FP16-INT8, Device: CPU
  Linux 6.8: 0.38 (SE +/- 0.01, N = 15, MIN: 0.23 / MAX: 46.5)
  Linux 6.9-rc2: 0.38 (SE +/- 0.00, N = 15, MIN: 0.24 / MAX: 50.65)
  1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

FPS, More Is Better - OpenVINO 2024.0, Model: Age Gender Recognition Retail 0013 FP16-INT8, Device: CPU
  Linux 6.8: 121702.12 (SE +/- 1455.69, N = 15)
  Linux 6.9-rc2: 119735.57 (SE +/- 1149.56, N = 15)
  1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

VVenC
Frames Per Second, More Is Better - VVenC 1.11, Video Input: Bosphorus 4K, Video Preset: Faster
  Linux 6.8: 9.847 (SE +/- 0.200, N = 12)
  Linux 6.9-rc2: 9.524 (SE +/- 0.172, N = 15)
  1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
FFmpeg
FPS, More Is Better - FFmpeg 6.1, Encoder: libx264, Scenario: Upload
  Linux 6.9-rc2: 11.51 (SE +/- 0.01, N = 3)
  Linux 6.8: 11.50 (SE +/- 0.01, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

OSPRay Studio
ms, Fewer Is Better - OSPRay Studio 1.0, Camera: 3, Resolution: 1080p, Samples Per Pixel: 1, Renderer: Path Tracer, Acceleration: CPU
  Linux 6.8: 569 (SE +/- 4.36, N = 15)
  Linux 6.9-rc2: 570 (SE +/- 4.85, N = 15)
ONNX Runtime ONNX Runtime is developed by Microsoft and partners as an open-source, cross-platform, high-performance machine learning inferencing and training accelerator. This test profile runs the ONNX Runtime with various models available from the ONNX Model Zoo. Learn more via the OpenBenchmarking.org test page.
Inference Time Cost (ms), Fewer Is Better - ONNX Runtime 1.17, Model: yolov4, Device: CPU, Executor: Standard
  Linux 6.9-rc2: 112.43 (SE +/- 1.33, N = 13)
  Linux 6.8: 122.24 (SE +/- 2.31, N = 15)
  1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

Inference Time Cost (ms), Fewer Is Better - ONNX Runtime 1.17, Model: fcn-resnet101-11, Device: CPU, Executor: Parallel
  Linux 6.9-rc2: 668.70 (SE +/- 14.65, N = 12)
  Linux 6.8: 747.08 (SE +/- 25.74, N = 15)
  1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

Inference Time Cost (ms), Fewer Is Better - ONNX Runtime 1.17, Model: GPT-2, Device: CPU, Executor: Standard
  Linux 6.8: 6.03604 (SE +/- 0.16304, N = 14)
  Linux 6.9-rc2: 6.57623 (SE +/- 0.19739, N = 13)
  1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

Inference Time Cost (ms), Fewer Is Better - ONNX Runtime 1.17, Model: bertsquad-12, Device: CPU, Executor: Standard
  Linux 6.9-rc2: 66.00 (SE +/- 2.28, N = 15)
  Linux 6.8: 70.96 (SE +/- 2.24, N = 12)
  1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

Inference Time Cost (ms), Fewer Is Better - ONNX Runtime 1.17, Model: ArcFace ResNet-100, Device: CPU, Executor: Standard
  Linux 6.9-rc2: 41.50 (SE +/- 0.86, N = 15)
  Linux 6.8: 42.78 (SE +/- 0.81, N = 12)
  1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

OSPRay Studio
ms, Fewer Is Better - OSPRay Studio 1.0, Camera: 2, Resolution: 1080p, Samples Per Pixel: 16, Renderer: Path Tracer, Acceleration: CPU
  Linux 6.8: 7402 (SE +/- 105.66, N = 3)
  Linux 6.9-rc2: 7506 (SE +/- 118.69, N = 15)
ONNX Runtime
Inference Time Cost (ms), Fewer Is Better - ONNX Runtime 1.17, Model: ResNet50 v1-12-int8, Device: CPU, Executor: Standard
  Linux 6.9-rc2: 8.23121 (SE +/- 0.21896, N = 15)
  Linux 6.8: 9.01538 (SE +/- 0.20919, N = 12)
  1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

Inference Time Cost (ms), Fewer Is Better - ONNX Runtime 1.17, Model: T5 Encoder, Device: CPU, Executor: Parallel
  Linux 6.9-rc2: 5.10755 (SE +/- 0.06160, N = 14)
  Linux 6.8: 5.41191 (SE +/- 0.09595, N = 12)
  1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

OSPRay Studio
ms, Fewer Is Better - OSPRay Studio 1.0, Camera: 1, Resolution: 4K, Samples Per Pixel: 32, Renderer: Path Tracer, Acceleration: CPU
  Linux 6.9-rc2: 61162 (SE +/- 567.54, N = 7)
  Linux 6.8: 61363 (SE +/- 443.96, N = 15)

Blender
Seconds, Fewer Is Better - Blender 4.1, Blend File: Pabellon Barcelona, Compute: CPU-Only
  Linux 6.9-rc2: 75.56 (SE +/- 0.61, N = 15)
  Linux 6.8: 76.20 (SE +/- 0.83, N = 5)

JPEG-XL libjxl
MP/s, More Is Better - JPEG-XL libjxl 0.10.1, Input: JPEG, Quality: 80
  Linux 6.8: 34.60 (SE +/- 0.51, N = 12)
  Linux 6.9-rc2: 34.19 (SE +/- 0.53, N = 15)
  1. (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm

OSPRay Studio
ms, Fewer Is Better - OSPRay Studio 1.0, Camera: 3, Resolution: 4K, Samples Per Pixel: 32, Renderer: Path Tracer, Acceleration: CPU
  Linux 6.8: 72172 (SE +/- 506.90, N = 3)
  Linux 6.9-rc2: 73367 (SE +/- 855.02, N = 15)
Quicksilver
Figure Of Merit, More Is Better - Quicksilver 20230818, Input: CORAL2 P1
  Linux 6.9-rc2: 6916667 (SE +/- 46491.34, N = 3)
  Linux 6.8: 6814667 (SE +/- 67589.78, N = 6)
  1. (CXX) g++ options: -fopenmp -O3 -march=native

Blender
Seconds, Fewer Is Better - Blender 4.1, Blend File: Junkshop, Compute: CPU-Only
  Linux 6.8: 45.60 (SE +/- 0.47, N = 15)
  Linux 6.9-rc2: 46.56 (SE +/- 0.40, N = 15)

TensorFlow
images/sec, More Is Better - TensorFlow 2.16.1, Device: CPU, Batch Size: 16, Model: ResNet-50
  Linux 6.9-rc2: 24.80 (SE +/- 0.28, N = 15)
  Linux 6.8: 24.12 (SE +/- 0.30, N = 3)
FFmpeg
FPS, More Is Better - FFmpeg 6.1, Encoder: libx265, Scenario: Upload
  Linux 6.8: 22.31 (SE +/- 0.31, N = 3)
  Linux 6.9-rc2: 22.24 (SE +/- 0.26, N = 4)
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

OSPRay Studio
ms, Fewer Is Better - OSPRay Studio 1.0, Camera: 1, Resolution: 1080p, Samples Per Pixel: 32, Renderer: Path Tracer, Acceleration: CPU
  Linux 6.9-rc2: 13807 (SE +/- 180.26, N = 3)
  Linux 6.8: 13867 (SE +/- 112.81, N = 9)
FFmpeg
FPS, More Is Better - FFmpeg 6.1, Encoder: libx264, Scenario: Platform
  Linux 6.9-rc2: 42.77 (SE +/- 0.10, N = 3)
  Linux 6.8: 42.76 (SE +/- 0.17, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

FPS, More Is Better - FFmpeg 6.1, Encoder: libx264, Scenario: Video On Demand
  Linux 6.8: 42.94 (SE +/- 0.06, N = 3)
  Linux 6.9-rc2: 42.88 (SE +/- 0.10, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

OSPRay Studio
ms, Fewer Is Better - OSPRay Studio 1.0, Camera: 3, Resolution: 4K, Samples Per Pixel: 1, Renderer: Path Tracer, Acceleration: CPU
  Linux 6.8: 2108 (SE +/- 21.99, N = 4)
  Linux 6.9-rc2: 2164 (SE +/- 26.16, N = 13)
Llamafile
Tokens Per Second, More Is Better - Llamafile 0.6, Test: mistral-7b-instruct-v0.2.Q8_0, Acceleration: CPU
  Linux 6.9-rc2: 4.58 (SE +/- 0.03, N = 3)
  Linux 6.8: 4.50 (SE +/- 0.04, N = 3)

OpenVINO
ms, Fewer Is Better - OpenVINO 2024.0, Model: Age Gender Recognition Retail 0013 FP16, Device: CPU
  Linux 6.9-rc2: 0.71 (SE +/- 0.03, N = 3, MIN: 0.32 / MAX: 74.71)
  Linux 6.8: 0.78 (SE +/- 0.01, N = 15, MIN: 0.3 / MAX: 138.08)
  1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

FPS, More Is Better - OpenVINO 2024.0, Model: Age Gender Recognition Retail 0013 FP16, Device: CPU
  Linux 6.9-rc2: 73754.74 (SE +/- 1055.18, N = 3)
  Linux 6.8: 67789.25 (SE +/- 711.41, N = 15)
  1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
ONNX Runtime
Inference Time Cost (ms), Fewer Is Better - ONNX Runtime 1.17, Model: GPT-2, Device: CPU, Executor: Parallel
  Linux 6.9-rc2: 9.46496 (SE +/- 0.12308, N = 15)
  Linux 6.8: 9.80514 (SE +/- 0.11034, N = 3)
  1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

Inference Time Cost (ms), Fewer Is Better - ONNX Runtime 1.17, Model: T5 Encoder, Device: CPU, Executor: Standard
  Linux 6.9-rc2: 2.48234 (SE +/- 0.01855, N = 3)
  Linux 6.8: 2.89501 (SE +/- 0.12453, N = 15)
  1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

TensorFlow
images/sec, More Is Better - TensorFlow 2.16.1, Device: CPU, Batch Size: 32, Model: GoogLeNet
  Linux 6.9-rc2: 97.53 (SE +/- 1.10, N = 15)
  Linux 6.8: 96.31 (SE +/- 1.27, N = 12)
ONNX Runtime
Inference Time Cost (ms), Fewer Is Better - ONNX Runtime 1.17, Model: CaffeNet 12-int8, Device: CPU, Executor: Parallel
  Linux 6.9-rc2: 3.40972 (SE +/- 0.02459, N = 3)
  Linux 6.8: 3.42357 (SE +/- 0.03367, N = 15)
  1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

Inference Time Cost (ms), Fewer Is Better - ONNX Runtime 1.17, Model: CaffeNet 12-int8, Device: CPU, Executor: Standard
  Linux 6.9-rc2: 1.74973 (SE +/- 0.01721, N = 3)
  Linux 6.8: 1.84807 (SE +/- 0.02432, N = 15)
  1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

Inference Time Cost (ms), Fewer Is Better - ONNX Runtime 1.17, Model: super-resolution-10, Device: CPU, Executor: Standard
  Linux 6.9-rc2: 4.67811 (SE +/- 0.01745, N = 3)
  Linux 6.8: 4.88181 (SE +/- 0.07270, N = 15)
  1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
OpenSSL
byte/s, More Is Better - OpenSSL, Algorithm: SHA512
  Linux 6.8: 29940735740 (SE +/- 92272355.83, N = 3)
  Linux 6.9-rc2: 29665499053 (SE +/- 252323355.66, N = 3)
  1. OpenSSL 3.0.10 1 Aug 2023 (Library: OpenSSL 3.0.10 1 Aug 2023)

JPEG-XL libjxl
MP/s, More Is Better - JPEG-XL libjxl 0.10.1, Input: PNG, Quality: 80
  Linux 6.9-rc2: 33.05 (SE +/- 0.36, N = 15)
  Linux 6.8: 32.48 (SE +/- 0.38, N = 3)
  1. (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm

TensorFlow
images/sec, More Is Better - TensorFlow 2.16.1, Device: CPU, Batch Size: 64, Model: GoogLeNet
  Linux 6.8: 135.18 (SE +/- 0.94, N = 3)
  Linux 6.9-rc2: 128.71 (SE +/- 1.70, N = 15)

OSPRay
Items Per Second, More Is Better - OSPRay 3.1, Benchmark: gravity_spheres_volume/dim_512/scivis/real_time
  Linux 6.9-rc2: 10.82 (SE +/- 0.07, N = 3)
  Linux 6.8: 10.50 (SE +/- 0.08, N = 15)

Items Per Second, More Is Better - OSPRay 3.1, Benchmark: gravity_spheres_volume/dim_512/ao/real_time
  Linux 6.8: 10.70 (SE +/- 0.14, N = 15)
  Linux 6.9-rc2: 10.51 (SE +/- 0.05, N = 3)

Items Per Second, More Is Better - OSPRay 3.1, Benchmark: particle_volume/scivis/real_time
  Linux 6.8: 29.28 (SE +/- 0.26, N = 3)
  Linux 6.9-rc2: 29.07 (SE +/- 0.33, N = 3)

OSPRay Studio
ms, Fewer Is Better - OSPRay Studio 1.0, Camera: 2, Resolution: 4K, Samples Per Pixel: 1, Renderer: Path Tracer, Acceleration: CPU
  Linux 6.9-rc2: 1839 (SE +/- 26.21, N = 3)
  Linux 6.8: 1861 (SE +/- 13.93, N = 11)
Embree

Intel Embree is a collection of high-performance ray-tracing kernels for execution on CPUs (and GPUs via SYCL), with support for instruction sets such as SSE, AVX, AVX2, and AVX-512. Embree can also make use of the Intel SPMD Program Compiler (ISPC). Learn more via the OpenBenchmarking.org test page.
Embree 4.3 - Binary: Pathtracer ISPC - Model: Asian Dragon Obj (Frames Per Second, more is better)
  Linux 6.9-rc2: 37.43 (SE +/- 0.88, N = 15; MIN: 27.23 / MAX: 49.01)
  Linux 6.8: 36.44 (SE +/- 0.94, N = 15; MIN: 22.56 / MAX: 49.05)

Embree 4.3 - Binary: Pathtracer - Model: Asian Dragon Obj (Frames Per Second, more is better)
  Linux 6.8: 39.82 (SE +/- 0.92, N = 15; MIN: 26.64 / MAX: 50.91)
  Linux 6.9-rc2: 36.50 (SE +/- 0.60, N = 15; MIN: 22.66 / MAX: 46.51)

Stress-NG 0.16.04 - Test: Pipe (Bogo Ops/s, more is better)
  Linux 6.8: 30377557.97 (SE +/- 749025.19, N = 15)
  Linux 6.9-rc2: 29814480.67 (SE +/- 2106194.07, N = 15)
1. (CXX) g++ options: -lm -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz

Stress-NG 0.16.04 - Test: Crypto (Bogo Ops/s, more is better)
  Linux 6.9-rc2: 147978.33 (SE +/- 2039.96, N = 15)
  Linux 6.8: 143967.24 (SE +/- 1720.03, N = 15)
1. (CXX) g++ options: -lm -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
ONNX Runtime

ONNX Runtime is developed by Microsoft and partners as an open-source, cross-platform, high-performance machine learning inferencing and training accelerator. This test profile runs the ONNX Runtime with various models available from the ONNX Model Zoo. Learn more via the OpenBenchmarking.org test page.
ONNX Runtime 1.17 - Model: fcn-resnet101-11 - Device: CPU - Executor: Standard (Inference Time Cost in ms, fewer is better)
  Linux 6.9-rc2: 184.79 (SE +/- 1.80, N = 3)
  Linux 6.8: 231.28 (SE +/- 11.46, N = 12)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

oneDNN

oneDNN 3.4 - Harness: Recurrent Neural Network Inference - Engine: CPU (ms, fewer is better)
  Linux 6.8: 748.50 (SE +/- 7.46, N = 3; MIN: 677.68)
  Linux 6.9-rc2: 753.36 (SE +/- 6.16, N = 9; MIN: 652.83)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

Stress-NG 0.16.04 - Test: Glibc C String Functions (Bogo Ops/s, more is better)
  Linux 6.9-rc2: 85035507.55 (SE +/- 1504028.14, N = 15)
  Linux 6.8: 80643523.24 (SE +/- 2138024.53, N = 13)
1. (CXX) g++ options: -lm -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Speedb

Speedb is a next-generation key-value storage engine that is RocksDB-compatible and aims for stability, efficiency, and performance. Learn more via the OpenBenchmarking.org test page.
Speedb 2.7 - Test: Read Random Write Random (Op/s, more is better)
  Linux 6.8: 1181758 (SE +/- 8674.90, N = 11)
  Linux 6.9-rc2: 1175562 (SE +/- 13609.90, N = 3)
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Stress-NG 0.16.04 - Test: Fused Multiply-Add (Bogo Ops/s, more is better)
  Linux 6.8: 204365411.59 (SE +/- 6187723.80, N = 12)
  Linux 6.9-rc2: 200468629.32 (SE +/- 3824188.11, N = 15)
1. (CXX) g++ options: -lm -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz

OSPRay Studio

OSPRay Studio 1.0 - Camera: 2 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU (ms, fewer is better)
  Linux 6.8: 35317 (SE +/- 302.37, N = 15)
  Linux 6.9-rc2: 35422 (SE +/- 483.46, N = 3)

TensorFlow

TensorFlow 2.16.1 - Device: CPU - Batch Size: 16 - Model: GoogLeNet (images/sec, more is better)
  Linux 6.9-rc2: 58.83 (SE +/- 1.24, N = 12)
  Linux 6.8: 58.50 (SE +/- 1.31, N = 12)

OSPRay Studio

OSPRay Studio 1.0 - Camera: 3 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU (ms, fewer is better)
  Linux 6.9-rc2: 39794 (SE +/- 252.89, N = 3)
  Linux 6.8: 40657 (SE +/- 626.71, N = 12)

srsRAN Project

srsRAN Project 23.10.1-20240219 - Test: PDSCH Processor Benchmark, Throughput Total (Mbps, more is better)
  Linux 6.8: 17006.8 (SE +/- 263.34, N = 15)
  Linux 6.9-rc2: 16877.6 (SE +/- 377.19, N = 12)
1. (CXX) g++ options: -march=native -mavx2 -mavx -msse4.1 -mfma -mavx512f -mavx512cd -mavx512bw -mavx512dq -O3 -fno-trapping-math -fno-math-errno -ldl
CacheBench

This is a performance test of CacheBench, which is part of LLCbench. CacheBench is designed to test the memory and cache bandwidth performance. Learn more via the OpenBenchmarking.org test page.
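To illustrate the kind of streaming pass a memory-bandwidth test times, here is a rough, unoptimized Python sketch; it is illustrative only and is not CacheBench's implementation:

```python
import time

def copy_bandwidth_mb_s(size_mb=64):
    """Time one read+write pass over a buffer and return MB/s copied."""
    src = bytearray(size_mb * 1024 * 1024)
    start = time.perf_counter()
    dst = bytes(src)  # reads every byte of src and writes every byte of dst
    elapsed = time.perf_counter() - start
    return size_mb / elapsed

print(f"{copy_bandwidth_mb_s():.0f} MB/s (interpreter overhead included)")
```

Interpreter overhead makes the absolute number meaningless next to the compiled CacheBench results; the point is only what an "MB/s" figure measures here.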
CacheBench - Test: Read (MB/s, more is better)
  Linux 6.8: 13344.70 (SE +/- 0.52, N = 3; MIN: 13338.06 / MAX: 13346.23)
  Linux 6.9-rc2: 13341.47 (SE +/- 0.88, N = 3; MIN: 13335.31 / MAX: 13343.85)
1. (CC) gcc options: -O3 -lrt

CacheBench - Test: Read / Modify / Write (MB/s, more is better)
  Linux 6.8: 99112.70 (SE +/- 2.71, N = 3; MIN: 88982.6 / MAX: 105154.55)
  Linux 6.9-rc2: 98711.08 (SE +/- 71.07, N = 3; MIN: 85707.72 / MAX: 105115.44)
1. (CC) gcc options: -O3 -lrt

CacheBench - Test: Write (MB/s, more is better)
  Linux 6.9-rc2: 93862.76 (SE +/- 65.10, N = 3; MIN: 54012.69 / MAX: 104121.12)
  Linux 6.8: 93036.30 (SE +/- 36.95, N = 3; MIN: 54037.87 / MAX: 103869.49)
1. (CC) gcc options: -O3 -lrt

Blender

Blender 4.1 - Blend File: Fishy Cat - Compute: CPU-Only (Seconds, fewer is better)
  Linux 6.8: 38.09 (SE +/- 0.38, N = 15)
  Linux 6.9-rc2: 39.37 (SE +/- 0.44, N = 4)

OSPRay

OSPRay 3.1 - Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time (Items Per Second, more is better)
  Linux 6.9-rc2: 25.17 (SE +/- 0.20, N = 3)
  Linux 6.8: 24.28 (SE +/- 0.41, N = 12)

OSPRay Studio 1.0 - Camera: 2 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU (ms, fewer is better)
  Linux 6.8: 63514 (SE +/- 591.53, N = 7)
  Linux 6.9-rc2: 63529 (SE +/- 885.61, N = 3)

Stress-NG 0.16.04 - Test: Floating Point (Bogo Ops/s, more is better)
  Linux 6.8: 29600.02 (SE +/- 632.70, N = 12)
  Linux 6.9-rc2: 29139.60 (SE +/- 214.66, N = 11)
1. (CXX) g++ options: -lm -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz

Stress-NG 0.16.04 - Test: MEMFD (Bogo Ops/s, more is better)
  Linux 6.9-rc2: 688.76 (SE +/- 5.94, N = 8)
  Linux 6.8: 688.26 (SE +/- 14.68, N = 15)
1. (CXX) g++ options: -lm -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Embree
Embree 4.3 - Binary: Pathtracer - Model: Crown (Frames Per Second, more is better)
  Linux 6.8: 36.60 (SE +/- 0.59, N = 15; MIN: 25.71 / MAX: 54.49)
  Linux 6.9-rc2: 35.08 (SE +/- 1.07, N = 15; MIN: 22.8 / MAX: 59.06)

Embree 4.3 - Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, more is better)
  Linux 6.9-rc2: 32.83 (SE +/- 0.82, N = 15; MIN: 22.56 / MAX: 48.16)
  Linux 6.8: 31.60 (SE +/- 0.77, N = 13; MIN: 20.8 / MAX: 52.83)

Embree 4.3 - Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, more is better)
  Linux 6.9-rc2: 35.65 (SE +/- 0.54, N = 15; MIN: 24.54 / MAX: 51.88)
  Linux 6.8: 35.40 (SE +/- 0.86, N = 15; MIN: 23.5 / MAX: 52.61)

VVenC

VVenC 1.11 - Video Input: Bosphorus 1080p - Video Preset: Faster (Frames Per Second, more is better)
  Linux 6.8: 33.99 (SE +/- 0.29, N = 15)
  Linux 6.9-rc2: 33.09 (SE +/- 0.36, N = 15)
1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

JPEG-XL libjxl

JPEG-XL libjxl 0.10.1 - Input: JPEG - Quality: 90 (MP/s, more is better)
  Linux 6.9-rc2: 33.89 (SE +/- 0.38, N = 15)
  Linux 6.8: 32.56 (SE +/- 0.37, N = 3)
1. (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
Xmrig

Xmrig is an open-source cross-platform CPU/GPU miner for RandomX, KawPow, CryptoNight, and AstroBWT. This test profile is set up to measure the Xmrig CPU mining performance. Learn more via the OpenBenchmarking.org test page.
Xmrig 6.21 - Variant: GhostRider - Hash Count: 1M (H/s, more is better)
  Linux 6.9-rc2: 9602.7 (SE +/- 42.79, N = 3)
  Linux 6.8: 9592.3 (SE +/- 31.05, N = 3)
1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

OSPRay

OSPRay 3.1 - Benchmark: particle_volume/ao/real_time (Items Per Second, more is better)
  Linux 6.8: 29.73 (SE +/- 0.11, N = 3)
  Linux 6.9-rc2: 28.88 (SE +/- 0.19, N = 3)

OSPRay Studio

OSPRay Studio 1.0 - Camera: 3 - Resolution: 1080p - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU (ms, fewer is better)
  Linux 6.9-rc2: 16687 (SE +/- 156.23, N = 6)
  Linux 6.8: 17204 (SE +/- 101.18, N = 3)

TensorFlow

TensorFlow 2.16.1 - Device: CPU - Batch Size: 1 - Model: GoogLeNet (images/sec, more is better)
  Linux 6.8: 6.05 (SE +/- 0.13, N = 12)
  Linux 6.9-rc2: 5.98 (SE +/- 0.14, N = 15)

SVT-AV1

SVT-AV1 2.0 - Encoder Mode: Preset 4 - Input: Bosphorus 4K (Frames Per Second, more is better)
  Linux 6.8: 5.865 (SE +/- 0.035, N = 3)
  Linux 6.9-rc2: 5.760 (SE +/- 0.051, N = 15)
1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Xmrig
Xmrig 6.21 - Variant: Monero - Hash Count: 1M (H/s, more is better)
  Linux 6.9-rc2: 33015.5 (SE +/- 181.84, N = 3)
  Linux 6.8: 32361.2 (SE +/- 320.50, N = 13)
1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

OSPRay Studio

OSPRay Studio 1.0 - Camera: 3 - Resolution: 1080p - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU (ms, fewer is better)
  Linux 6.9-rc2: 8552 (SE +/- 26.21, N = 3)
  Linux 6.8: 8732 (SE +/- 70.21, N = 3)

OSPRay Studio 1.0 - Camera: 1 - Resolution: 1080p - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU (ms, fewer is better)
  Linux 6.9-rc2: 6804 (SE +/- 15.17, N = 3)
  Linux 6.8: 6909 (SE +/- 83.68, N = 3)

RocksDB

RocksDB 9.0 - Test: Read Random Write Random (Op/s, more is better)
  Linux 6.8: 2120844 (SE +/- 17412.72, N = 3)
  Linux 6.9-rc2: 2068588 (SE +/- 22543.45, N = 5)
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

OSPRay Studio

OSPRay Studio 1.0 - Camera: 2 - Resolution: 1080p - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU (ms, fewer is better)
  Linux 6.9-rc2: 489 (SE +/- 2.33, N = 3)
  Linux 6.8: 501 (SE +/- 5.26, N = 5)

OpenVINO

OpenVINO 2024.0 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (ms, fewer is better)
  Linux 6.9-rc2: 3.89 (SE +/- 0.01, N = 3; MIN: 2.64 / MAX: 47.09)
  Linux 6.8: 4.05 (SE +/- 0.05, N = 4; MIN: 2.67 / MAX: 52.43)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (FPS, more is better)
  Linux 6.9-rc2: 27790.24 (SE +/- 155.42, N = 3)
  Linux 6.8: 26732.88 (SE +/- 285.82, N = 4)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
WebP2 Image Encode

This is a test of Google's libwebp2 library with the WebP2 image encode utility, using a sample 6000x4000 pixel JPEG image as the input, similar to the WebP/libwebp test profile. WebP2 is currently experimental and under heavy development as ultimately the successor to WebP. Compared to WebP, WebP2 supports 10-bit HDR, more efficient lossy compression, improved lossless compression, animation support, and full multi-threading. Learn more via the OpenBenchmarking.org test page.
WebP2 Image Encode 20220823 - Encode Settings: Quality 95, Compression Effort 7 (MP/s, more is better)
  Linux 6.8: 0.34 (SE +/- 0.00, N = 3)
  Linux 6.9-rc2: 0.33 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -msse4.2 -fno-rtti -O3 -ldl

SVT-AV1

SVT-AV1 2.0 - Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, more is better)
  Linux 6.8: 48.94 (SE +/- 1.10, N = 12)
  Linux 6.9-rc2: 48.57 (SE +/- 1.18, N = 15)
1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

TensorFlow

TensorFlow 2.16.1 - Device: CPU - Batch Size: 256 - Model: AlexNet (images/sec, more is better)
  Linux 6.8: 723.16 (SE +/- 9.01, N = 3)
  Linux 6.9-rc2: 711.74 (SE +/- 6.62, N = 7)

OpenVINO

OpenVINO 2024.0 - Model: Face Detection FP16-INT8 - Device: CPU (ms, fewer is better)
  Linux 6.8: 337.21 (SE +/- 0.74, N = 3; MIN: 261.24 / MAX: 615.09)
  Linux 6.9-rc2: 337.78 (SE +/- 0.81, N = 3; MIN: 253.21 / MAX: 781.06)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Face Detection FP16-INT8 - Device: CPU (FPS, more is better)
  Linux 6.8: 331.23 (SE +/- 0.75, N = 3)
  Linux 6.9-rc2: 330.73 (SE +/- 0.82, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Noise Suppression Poconet-Like FP16 - Device: CPU (ms, fewer is better)
  Linux 6.8: 23.68 (SE +/- 0.17, N = 3; MIN: 11.64 / MAX: 175.69)
  Linux 6.9-rc2: 23.97 (SE +/- 0.20, N = 3; MIN: 11.24 / MAX: 358.1)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Noise Suppression Poconet-Like FP16 - Device: CPU (FPS, more is better)
  Linux 6.8: 4621.32 (SE +/- 54.03, N = 3)
  Linux 6.9-rc2: 4600.95 (SE +/- 29.05, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (ms, fewer is better)
  Linux 6.8: 17.52 (SE +/- 0.06, N = 3; MIN: 11.97 / MAX: 108.98)
  Linux 6.9-rc2: 17.58 (SE +/- 0.07, N = 3; MIN: 12.03 / MAX: 66.23)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (FPS, more is better)
  Linux 6.8: 6336.78 (SE +/- 18.88, N = 3)
  Linux 6.9-rc2: 6306.42 (SE +/- 23.77, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Face Detection FP16 - Device: CPU (ms, fewer is better)
  Linux 6.9-rc2: 302.77 (SE +/- 1.30, N = 3; MIN: 181.55 / MAX: 894.73)
  Linux 6.8: 307.74 (SE +/- 0.84, N = 3; MIN: 183.58 / MAX: 728.14)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Face Detection FP16 - Device: CPU (FPS, more is better)
  Linux 6.9-rc2: 121.93 (SE +/- 0.55, N = 3)
  Linux 6.8: 119.92 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Road Segmentation ADAS FP16-INT8 - Device: CPU (ms, fewer is better)
  Linux 6.8: 77.34 (SE +/- 0.09, N = 3; MIN: 56.48 / MAX: 158.67)
  Linux 6.9-rc2: 77.57 (SE +/- 0.28, N = 3; MIN: 57.16 / MAX: 148.67)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Road Segmentation ADAS FP16-INT8 - Device: CPU (FPS, more is better)
  Linux 6.8: 1446.42 (SE +/- 1.59, N = 3)
  Linux 6.9-rc2: 1442.21 (SE +/- 5.16, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Person Detection FP16 - Device: CPU (ms, fewer is better)
  Linux 6.9-rc2: 79.23 (SE +/- 0.20, N = 3; MIN: 50.1 / MAX: 290.06)
  Linux 6.8: 81.90 (SE +/- 0.43, N = 3; MIN: 49.43 / MAX: 345.35)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Person Detection FP16 - Device: CPU (FPS, more is better)
  Linux 6.9-rc2: 466.45 (SE +/- 1.20, N = 3)
  Linux 6.8: 451.13 (SE +/- 2.33, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Person Detection FP32 - Device: CPU (ms, fewer is better)
  Linux 6.9-rc2: 79.00 (SE +/- 0.62, N = 3; MIN: 49.24 / MAX: 319.95)
  Linux 6.8: 82.87 (SE +/- 0.57, N = 3; MIN: 51.69 / MAX: 529.56)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Person Detection FP32 - Device: CPU (FPS, more is better)
  Linux 6.9-rc2: 467.79 (SE +/- 3.72, N = 3)
  Linux 6.8: 445.90 (SE +/- 3.09, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Person Re-Identification Retail FP16 - Device: CPU (ms, fewer is better)
  Linux 6.8: 9.59 (SE +/- 0.02, N = 3; MIN: 7.62 / MAX: 27.98)
  Linux 6.9-rc2: 9.65 (SE +/- 0.03, N = 3; MIN: 7.62 / MAX: 34.18)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Person Re-Identification Retail FP16 - Device: CPU (FPS, more is better)
  Linux 6.8: 11638.51 (SE +/- 20.55, N = 3)
  Linux 6.9-rc2: 11564.78 (SE +/- 28.95, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Blender

Blender 4.1 - Blend File: BMW27 - Compute: CPU-Only (Seconds, fewer is better)
  Linux 6.8: 21.82 (SE +/- 0.33, N = 14)
  Linux 6.9-rc2: 22.02 (SE +/- 0.08, N = 3)

OpenVINO

OpenVINO 2024.0 - Model: Machine Translation EN To DE FP16 - Device: CPU (ms, fewer is better)
  Linux 6.9-rc2: 54.83 (SE +/- 0.25, N = 3; MIN: 34.71 / MAX: 375.04)
  Linux 6.8: 54.98 (SE +/- 0.47, N = 3; MIN: 33.9 / MAX: 423.36)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Machine Translation EN To DE FP16 - Device: CPU (FPS, more is better)
  Linux 6.9-rc2: 673.46 (SE +/- 3.16, N = 3)
  Linux 6.8: 671.66 (SE +/- 5.75, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

oneDNN

oneDNN 3.4 - Harness: Deconvolution Batch shapes_1d - Engine: CPU (ms, fewer is better)
  Linux 6.9-rc2: 18.03 (SE +/- 0.14, N = 3; MIN: 13.29)
  Linux 6.8: 18.34 (SE +/- 0.23, N = 15; MIN: 10.09)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenVINO

OpenVINO 2024.0 - Model: Vehicle Detection FP16-INT8 - Device: CPU (ms, fewer is better)
  Linux 6.8: 22.06 (SE +/- 0.04, N = 3; MIN: 15.51 / MAX: 86.25)
  Linux 6.9-rc2: 22.12 (SE +/- 0.08, N = 3; MIN: 15.73 / MAX: 71.5)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Vehicle Detection FP16-INT8 - Device: CPU (FPS, more is better)
  Linux 6.8: 5067.40 (SE +/- 9.46, N = 3)
  Linux 6.9-rc2: 5052.71 (SE +/- 18.34, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

TensorFlow

TensorFlow 2.16.1 - Device: CPU - Batch Size: 32 - Model: AlexNet (images/sec, more is better)
  Linux 6.8: 346.10 (SE +/- 2.86, N = 15)
  Linux 6.9-rc2: 341.97 (SE +/- 5.16, N = 15)

OpenVINO

OpenVINO 2024.0 - Model: Road Segmentation ADAS FP16 - Device: CPU (ms, fewer is better)
  Linux 6.9-rc2: 35.14 (SE +/- 0.23, N = 3; MIN: 25.25 / MAX: 437.27)
  Linux 6.8: 36.28 (SE +/- 0.35, N = 3; MIN: 24.78 / MAX: 205.05)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Road Segmentation ADAS FP16 - Device: CPU (FPS, more is better)
  Linux 6.9-rc2: 1051.30 (SE +/- 6.69, N = 3)
  Linux 6.8: 1018.47 (SE +/- 9.82, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
ONNX Runtime
ONNX Runtime 1.17 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel (Inference Time Cost in ms, fewer is better)
  Linux 6.8: 42.68 (SE +/- 0.23, N = 3)
  Linux 6.9-rc2: 43.15 (SE +/- 0.13, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime 1.17 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard (Inference Time Cost in ms, fewer is better)
  Linux 6.9-rc2: 29.11 (SE +/- 0.22, N = 3)
  Linux 6.8: 30.82 (SE +/- 0.38, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

OpenVINO

OpenVINO 2024.0 - Model: Face Detection Retail FP16 - Device: CPU (ms, fewer is better)
  Linux 6.9-rc2: 9.89 (SE +/- 0.02, N = 3; MIN: 7.63 / MAX: 36.57)
  Linux 6.8: 9.92 (SE +/- 0.02, N = 3; MIN: 7.45 / MAX: 33.82)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Face Detection Retail FP16 - Device: CPU (FPS, more is better)
  Linux 6.9-rc2: 11295.92 (SE +/- 26.72, N = 3)
  Linux 6.8: 11260.98 (SE +/- 23.83, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Embree
Embree 4.3 - Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, more is better)
  Linux 6.8: 38.84 (SE +/- 0.47, N = 4; MIN: 30.82 / MAX: 52.25)
  Linux 6.9-rc2: 35.73 (SE +/- 1.44, N = 12; MIN: 20.32 / MAX: 57.51)

OpenVINO

OpenVINO 2024.0 - Model: Weld Porosity Detection FP16 - Device: CPU (ms, fewer is better)
  Linux 6.9-rc2: 6.76 (SE +/- 0.05, N = 3; MIN: 4.36 / MAX: 70.91)
  Linux 6.8: 6.85 (SE +/- 0.08, N = 3; MIN: 4.47 / MAX: 51.04)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Weld Porosity Detection FP16 - Device: CPU (FPS, more is better)
  Linux 6.9-rc2: 16349.05 (SE +/- 107.94, N = 3)
  Linux 6.8: 16147.57 (SE +/- 194.69, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Face Detection Retail FP16-INT8 - Device: CPU (ms, fewer is better)
  Linux 6.9-rc2: 7.14 (SE +/- 0.00, N = 3; MIN: 5.79 / MAX: 44.66)
  Linux 6.8: 7.17 (SE +/- 0.07, N = 3; MIN: 5.78 / MAX: 54.3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Face Detection Retail FP16-INT8 - Device: CPU (FPS, more is better)
  Linux 6.9-rc2: 15589.59 (SE +/- 25.09, N = 3)
  Linux 6.8: 15532.54 (SE +/- 132.85, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Vehicle Detection FP16 - Device: CPU (ms, fewer is better)
  Linux 6.9-rc2: 13.89 (SE +/- 0.14, N = 3; MIN: 9.12 / MAX: 90.57)
  Linux 6.8: 14.58 (SE +/- 0.10, N = 3; MIN: 9.5 / MAX: 109.83)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Vehicle Detection FP16 - Device: CPU (FPS, more is better)
  Linux 6.9-rc2: 2656.42 (SE +/- 25.96, N = 3)
  Linux 6.8: 2530.04 (SE +/- 16.72, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Handwritten English Recognition FP16 - Device: CPU (ms, fewer is better)
  Linux 6.8: 30.85 (SE +/- 0.03, N = 3; MIN: 24.82 / MAX: 82.03)
  Linux 6.9-rc2: 31.01 (SE +/- 0.09, N = 3; MIN: 24.42 / MAX: 90.86)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Handwritten English Recognition FP16 - Device: CPU (FPS, more is better)
  Linux 6.8: 3626.41 (SE +/- 3.14, N = 3)
  Linux 6.9-rc2: 3608.21 (SE +/- 10.89, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
ONNX Runtime
ONNX Runtime 1.17 - Model: bertsquad-12 - Device: CPU - Executor: Parallel (Inference Time Cost in ms, fewer is better)
  Linux 6.8: 92.75 (SE +/- 0.33, N = 3)
  Linux 6.9-rc2: 94.33 (SE +/- 0.44, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime 1.17 - Model: yolov4 - Device: CPU - Executor: Parallel (Inference Time Cost in ms, fewer is better)
  Linux 6.9-rc2: 124.38 (SE +/- 1.32, N = 3)
  Linux 6.8: 126.38 (SE +/- 1.18, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

OpenVINO

OpenVINO 2024.0 - Model: Handwritten English Recognition FP16-INT8 - Device: CPU (ms, fewer is better)
  Linux 6.8: 29.61 (SE +/- 0.08, N = 3; MIN: 25.51 / MAX: 92.13)
  Linux 6.9-rc2: 30.20 (SE +/- 0.07, N = 3; MIN: 22.08 / MAX: 107.87)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2024.0 - Model: Handwritten English Recognition FP16-INT8 - Device: CPU (FPS, more is better)
  Linux 6.8: 3777.56 (SE +/- 10.67, N = 3)
  Linux 6.9-rc2: 3704.38 (SE +/- 8.36, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
ONNX Runtime
ONNX Runtime 1.17 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel (Inference Time Cost in ms, fewer is better)
  Linux 6.9-rc2: 56.95 (SE +/- 0.56, N = 3)
  Linux 6.8: 59.60 (SE +/- 0.81, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
Speedb
Speedb 2.7 - Test: Random Fill (Op/s, more is better)
  Linux 6.8: 326499 (SE +/- 2148.02, N = 3)
  Linux 6.9-rc2: 322816 (SE +/- 3492.28, N = 3)
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Speedb 2.7 - Test: Random Fill Sync (Op/s, more is better)
  Linux 6.9-rc2: 259603 (SE +/- 3603.61, N = 3)
  Linux 6.8: 257446 (SE +/- 1490.88, N = 3)
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Speedb 2.7 - Test: Update Random (Op/s, more is better)
  Linux 6.9-rc2: 293414 (SE +/- 3235.16, N = 3)
  Linux 6.8: 292180 (SE +/- 3488.80, N = 3)
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

RocksDB

RocksDB 9.0 - Test: Overwrite (Op/s, more is better)
  Linux 6.9-rc2: 391968 (SE +/- 3053.81, N = 3)
  Linux 6.8: 388010 (SE +/- 881.98, N = 3)
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

RocksDB 9.0 - Test: Update Random (Op/s, more is better)
  Linux 6.9-rc2: 361502 (SE +/- 1522.72, N = 3)
  Linux 6.8: 359825 (SE +/- 1470.46, N = 3)
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
ONNX Runtime
OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.17 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel Linux 6.9-rc2 Linux 6.8 3 6 9 12 15 SE +/- 0.14, N = 3 SE +/- 0.10, N = 3 10.51 10.61 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
Speedb
OpenBenchmarking.org Op/s, More Is Better Speedb 2.7 Test: Random Read Linux 6.8 Linux 6.9-rc2 80M 160M 240M 320M 400M SE +/- 1002006.26, N = 3 SE +/- 4146957.10, N = 3 380419389 375837479 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
ONNX Runtime
OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.17 Model: super-resolution-10 - Device: CPU - Executor: Parallel Linux 6.9-rc2 Linux 6.8 2 4 6 8 10 SE +/- 0.06370, N = 3 SE +/- 0.04024, N = 3 6.51127 6.55011 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
OpenSSL OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. The system/openssl test profile relies on benchmarking the system/OS-supplied openssl binary rather than the pts/openssl test profile, which benchmarks a locally built OpenSSL. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org verify/s, More Is Better OpenSSL Algorithm: RSA4096 Linux 6.9-rc2 Linux 6.8 300K 600K 900K 1200K 1500K SE +/- 833.27, N = 3 SE +/- 19064.62, N = 3 1396904.0 1358616.4 1. OpenSSL 3.0.10 1 Aug 2023 (Library: OpenSSL 3.0.10 1 Aug 2023)
OpenBenchmarking.org sign/s, More Is Better OpenSSL Algorithm: RSA4096 Linux 6.8 Linux 6.9-rc2 8K 16K 24K 32K 40K SE +/- 46.27, N = 3 SE +/- 90.23, N = 3 38204.6 38147.4 1. OpenSSL 3.0.10 1 Aug 2023 (Library: OpenSSL 3.0.10 1 Aug 2023)
RocksDB OpenBenchmarking.org Op/s, More Is Better RocksDB 9.0 Test: Random Read Linux 6.9-rc2 Linux 6.8 80M 160M 240M 320M 400M SE +/- 1447290.34, N = 3 SE +/- 1573786.50, N = 3 361586083 360498518 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
OSPRay Studio OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 1 - Resolution: 1080p - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU Linux 6.8 Linux 6.9-rc2 100 200 300 400 500 SE +/- 3.71, N = 3 SE +/- 2.33, N = 3 466 478
Blender OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.1 Blend File: Classroom - Compute: CPU-Only Linux 6.9-rc2 Linux 6.8 12 24 36 48 60 SE +/- 0.09, N = 3 SE +/- 0.10, N = 3 53.82 53.98
TensorFlow OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 16 - Model: AlexNet Linux 6.8 Linux 6.9-rc2 50 100 150 200 250 SE +/- 1.58, N = 15 SE +/- 1.89, N = 15 213.35 210.71
OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 64 - Model: AlexNet Linux 6.8 Linux 6.9-rc2 110 220 330 440 550 SE +/- 6.05, N = 3 SE +/- 5.17, N = 15 495.72 471.88
FFmpeg This is a benchmark of the FFmpeg multimedia framework. The FFmpeg test profile makes use of a modified version of vbench from Columbia University's Architecture and Design Lab (ARCADE) [http://arcade.cs.columbia.edu/vbench/], a benchmark for video-as-a-service workloads. The test profile offers a range of vbench scenarios based on freely distributable video content and the option of using either the x264 or x265 video encoder for transcoding. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org FPS, More Is Better FFmpeg 6.1 Encoder: libx264 - Scenario: Live Linux 6.9-rc2 Linux 6.8 40 80 120 160 200 SE +/- 0.70, N = 3 SE +/- 1.29, N = 3 182.12 181.51 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
OSPRay Studio OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 1 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU Linux 6.8 Linux 6.9-rc2 7K 14K 21K 28K 35K SE +/- 294.16, N = 3 SE +/- 454.99, N = 3 34412 34953
WebP2 Image Encode This is a test of Google's libwebp2 library with the WebP2 image encode utility, using a sample 6000x4000 pixel JPEG image as the input, similar to the WebP/libwebp test profile. WebP2 is currently experimental and under heavy development, ultimately as the successor to WebP. Compared to WebP, WebP2 supports 10-bit HDR, more efficient lossy compression, improved lossless compression, animation support, and full multi-threading. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org MP/s, More Is Better WebP2 Image Encode 20220823 Encode Settings: Quality 75, Compression Effort 7 Linux 6.8 Linux 6.9-rc2 0.144 0.288 0.432 0.576 0.72 SE +/- 0.01, N = 3 SE +/- 0.01, N = 4 0.64 0.61 1. (CXX) g++ options: -msse4.2 -fno-rtti -O3 -ldl
VVenC OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.11 Video Input: Bosphorus 1080p - Video Preset: Fast Linux 6.9-rc2 Linux 6.8 4 8 12 16 20 SE +/- 0.07, N = 3 SE +/- 0.11, N = 3 16.15 15.91 1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
JPEG-XL libjxl OpenBenchmarking.org MP/s, More Is Better JPEG-XL libjxl 0.10.1 Input: PNG - Quality: 90 Linux 6.8 Linux 6.9-rc2 7 14 21 28 35 SE +/- 0.28, N = 3 SE +/- 0.18, N = 3 30.97 30.13 1. (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
Primesieve OpenBenchmarking.org Seconds, Fewer Is Better Primesieve 12.1 Length: 1e13 Linux 6.9-rc2 Linux 6.8 9 18 27 36 45 SE +/- 0.09, N = 3 SE +/- 0.06, N = 3 39.46 39.62 1. (CXX) g++ options: -O3
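Primesieve counts primes with a highly optimized, multi-threaded segmented sieve of Eratosthenes; the 1e13 length above is the upper bound of the sieving interval. A minimal, unoptimized sketch of the underlying algorithm (not primesieve's cache-friendly segmented implementation):

```python
def count_primes(limit: int) -> int:
    """Count primes below limit with a basic sieve of Eratosthenes."""
    if limit < 2:
        return 0
    sieve = bytearray([1]) * limit  # sieve[i] == 1 means i is (still) prime
    sieve[0:2] = b"\x00\x00"        # 0 and 1 are not prime
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            # Cross off multiples of p starting at p*p
            sieve[p * p :: p] = bytearray(len(range(p * p, limit, p)))
    return sum(sieve)

if __name__ == "__main__":
    print(count_primes(1_000_000))  # 78498 primes below one million
```

This naive version is memory-bound at large limits; primesieve instead sieves fixed-size segments that fit in CPU cache and splits segments across threads, which is why it stresses both the memory subsystem and the scheduler.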
SVT-AV1 OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 12 - Input: Bosphorus 4K Linux 6.8 Linux 6.9-rc2 30 60 90 120 150 SE +/- 5.13, N = 15 SE +/- 5.21, N = 15 127.20 126.19 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
oneDNN OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: IP Shapes 1D - Engine: CPU Linux 6.8 Linux 6.9-rc2 1.0151 2.0302 3.0453 4.0604 5.0755 SE +/- 0.04580, N = 3 SE +/- 0.05180, N = 12 4.50055 4.51166 MIN: 3.52 MIN: 3.37 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Xmrig Xmrig is an open-source, cross-platform CPU/GPU miner for RandomX, KawPow, CryptoNight, and AstroBWT. This test profile is set up to measure Xmrig's CPU mining performance. Learn more via the OpenBenchmarking.org test page.
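Xmrig reports throughput in hashes per second (H/s). A toy sketch of how such a hash rate is measured, using SHA-256 from the standard library purely as a stand-in (RandomX and the CryptoNight variants are vastly heavier per hash and memory-hard by design):

```python
import hashlib
import time

def hash_rate(n_hashes: int = 100_000) -> float:
    """Hash a fixed header with an incrementing nonce; return hashes/second."""
    data = b"\x00" * 76  # stand-in for a block header prefix
    start = time.perf_counter()
    for nonce in range(n_hashes):
        hashlib.sha256(data + nonce.to_bytes(4, "little")).digest()
    elapsed = time.perf_counter() - start
    return n_hashes / elapsed

if __name__ == "__main__":
    print(f"{hash_rate():,.0f} H/s")
```

The memory-hardness of RandomX is why the Xmrig numbers in these results track memory and cache behavior more than raw ALU throughput.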
OpenBenchmarking.org H/s, More Is Better Xmrig 6.21 Variant: CryptoNight-Femto UPX2 - Hash Count: 1M Linux 6.8 Linux 6.9-rc2 7K 14K 21K 28K 35K SE +/- 110.64, N = 3 SE +/- 170.37, N = 3 33002.2 32993.8 1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
OpenBenchmarking.org H/s, More Is Better Xmrig 6.21 Variant: CryptoNight-Heavy - Hash Count: 1M Linux 6.8 Linux 6.9-rc2 7K 14K 21K 28K 35K SE +/- 49.86, N = 3 SE +/- 160.32, N = 3 32856.3 32826.1 1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
SVT-AV1 OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 13 - Input: Bosphorus 4K Linux 6.8 Linux 6.9-rc2 30 60 90 120 150 SE +/- 3.82, N = 12 SE +/- 5.42, N = 15 123.25 123.10 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Xmrig
OpenBenchmarking.org H/s, More Is Better Xmrig 6.21 Variant: KawPow - Hash Count: 1M Linux 6.9-rc2 Linux 6.8 7K 14K 21K 28K 35K SE +/- 137.96, N = 3 SE +/- 68.36, N = 3 33072.5 32860.8 1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
SVT-AV1 OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 4 - Input: Bosphorus 1080p Linux 6.9-rc2 Linux 6.8 5 10 15 20 25 SE +/- 0.21, N = 3 SE +/- 0.13, N = 15 18.93 18.66 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
GROMACS The GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package is tested with the water_GMX50 data set. This test profile allows selecting between CPU- and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Ns Per Day, More Is Better GROMACS 2024 Implementation: MPI CPU - Input: water_GMX50_bare Linux 6.8 Linux 6.9-rc2 3 6 9 12 15 SE +/- 0.06, N = 3 SE +/- 0.08, N = 3 12.09 11.76 1. (CXX) g++ options: -O3 -lm
Xmrig
OpenBenchmarking.org H/s, More Is Better Xmrig 6.21 Variant: Wownero - Hash Count: 1M Linux 6.9-rc2 Linux 6.8 8K 16K 24K 32K 40K SE +/- 490.33, N = 3 SE +/- 245.88, N = 3 36180.7 36096.6 1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.16.04 Test: Socket Activity Linux 6.8 Linux 6.9-rc2 10K 20K 30K 40K 50K SE +/- 79.93, N = 3 SE +/- 21.40, N = 3 47927.75 46123.32 1. (CXX) g++ options: -lm -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.16.04 Test: Zlib Linux 6.8 Linux 6.9-rc2 2K 4K 6K 8K 10K SE +/- 25.13, N = 3 SE +/- 124.93, N = 3 9670.37 9446.07 1. (CXX) g++ options: -lm -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.16.04 Test: Mixed Scheduler Linux 6.8 Linux 6.9-rc2 15K 30K 45K 60K 75K SE +/- 153.71, N = 3 SE +/- 281.29, N = 3 70407.96 70036.20 1. (CXX) g++ options: -lm -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.16.04 Test: Glibc Qsort Data Sorting Linux 6.8 Linux 6.9-rc2 400 800 1200 1600 2000 SE +/- 4.40, N = 3 SE +/- 4.16, N = 3 1866.58 1858.70 1. (CXX) g++ options: -lm -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.16.04 Test: Semaphores Linux 6.8 Linux 6.9-rc2 30M 60M 90M 120M 150M SE +/- 1498916.25, N = 3 SE +/- 1330349.25, N = 3 154589461.19 150005776.47 1. (CXX) g++ options: -lm -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.16.04 Test: CPU Stress Linux 6.8 Linux 6.9-rc2 40K 80K 120K 160K 200K SE +/- 569.84, N = 3 SE +/- 335.84, N = 3 202073.08 200671.70 1. (CXX) g++ options: -lm -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.16.04 Test: SENDFILE Linux 6.9-rc2 Linux 6.8 400K 800K 1200K 1600K 2000K SE +/- 5873.31, N = 3 SE +/- 609.53, N = 3 1900682.08 1798203.12 1. (CXX) g++ options: -lm -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.16.04 Test: Poll Linux 6.8 Linux 6.9-rc2 2M 4M 6M 8M 10M SE +/- 37074.52, N = 3 SE +/- 30731.99, N = 3 10389186.02 10232390.41 1. (CXX) g++ options: -lm -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.16.04 Test: Hash Linux 6.9-rc2 Linux 6.8 4M 8M 12M 16M 20M SE +/- 192774.29, N = 3 SE +/- 82496.53, N = 3 17199190.09 17071694.52 1. (CXX) g++ options: -lm -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.16.04 Test: AVL Tree Linux 6.8 Linux 6.9-rc2 200 400 600 800 1000 SE +/- 6.53, N = 3 SE +/- 2.13, N = 3 810.19 808.99 1. (CXX) g++ options: -lm -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.16.04 Test: Vector Shuffle Linux 6.8 Linux 6.9-rc2 120K 240K 360K 480K 600K SE +/- 1416.85, N = 3 SE +/- 1897.90, N = 3 576905.92 569593.00 1. (CXX) g++ options: -lm -latomic -lc -lcrypt -ldl -lEGL -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
OpenBenchmarking.org MB/s, More Is Better LZ4 Compression 1.9.4 Compression Level: 3 - Decompression Speed Linux 6.8 Linux 6.9-rc2 700 1400 2100 2800 3500 SE +/- 33.92, N = 3 SE +/- 34.17, N = 3 3364.8 3349.6 1. (CC) gcc options: -O3
OpenBenchmarking.org MB/s, More Is Better LZ4 Compression 1.9.4 Compression Level: 1 - Compression Speed Linux 6.8 Linux 6.9-rc2 130 260 390 520 650 SE +/- 1.15, N = 3 SE +/- 0.16, N = 3 602.69 602.54 1. (CC) gcc options: -O3
WebP2 Image Encode
OpenBenchmarking.org MP/s, More Is Better WebP2 Image Encode 20220823 Encode Settings: Default Linux 6.8 Linux 6.9-rc2 1.3275 2.655 3.9825 5.31 6.6375 SE +/- 0.11, N = 15 SE +/- 0.09, N = 15 5.90 5.65 1. (CXX) g++ options: -msse4.2 -fno-rtti -O3 -ldl
TensorFlow OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.16.1 Device: CPU - Batch Size: 1 - Model: AlexNet Linux 6.9-rc2 Linux 6.8 4 8 12 16 20 SE +/- 0.25, N = 3 SE +/- 0.25, N = 12 18.28 17.23
JPEG-XL libjxl OpenBenchmarking.org MP/s, More Is Better JPEG-XL libjxl 0.10.1 Input: PNG - Quality: 100 Linux 6.8 Linux 6.9-rc2 6 12 18 24 30 SE +/- 0.15, N = 3 SE +/- 0.21, N = 3 25.67 25.61 1. (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
OpenBenchmarking.org MP/s, More Is Better JPEG-XL libjxl 0.10.1 Input: JPEG - Quality: 100 Linux 6.8 Linux 6.9-rc2 6 12 18 24 30 SE +/- 0.11, N = 3 SE +/- 0.08, N = 3 26.00 25.96 1. (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
oneDNN OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: Convolution Batch Shapes Auto - Engine: CPU Linux 6.9-rc2 Linux 6.8 3 6 9 12 15 SE +/- 0.09863, N = 15 SE +/- 0.03328, N = 3 8.81043 9.00621 MIN: 6.25 MIN: 8.12 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
srsRAN Project OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.10.1-20240219 Test: PDSCH Processor Benchmark, Throughput Thread Linux 6.8 Linux 6.9-rc2 140 280 420 560 700 SE +/- 26.44, N = 12 SE +/- 29.77, N = 14 662.4 650.8 1. (CXX) g++ options: -march=native -mavx2 -mavx -msse4.1 -mfma -mavx512f -mavx512cd -mavx512bw -mavx512dq -O3 -fno-trapping-math -fno-math-errno -ldl
Parallel BZIP2 Compression OpenBenchmarking.org Seconds, Fewer Is Better Parallel BZIP2 Compression 1.1.13 FreeBSD-13.0-RELEASE-amd64-memstick.img Compression Linux 6.9-rc2 Linux 6.8 0.6601 1.3202 1.9803 2.6404 3.3005 SE +/- 0.102058, N = 12 SE +/- 0.081558, N = 15 2.926551 2.933717 1. (CXX) g++ options: -O2 -pthread -lbz2 -lpthread
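Parallel BZIP2 (pbzip2) gets its speedup by splitting the input into independent blocks, compressing them concurrently, and concatenating the resulting streams, which any bzip2 decompressor accepts. A minimal sketch of that scheme using Python's bz2 module (block size and worker count here are illustrative, not pbzip2's defaults):

```python
import bz2
from concurrent.futures import ThreadPoolExecutor

def parallel_bzip2(data: bytes, block_size: int = 900_000, workers: int = 4) -> bytes:
    """Compress independent blocks in parallel and concatenate the bz2 streams.
    bz2 releases the GIL during compression, so threads run truly in parallel."""
    blocks = [data[i : i + block_size] for i in range(0, len(data), block_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(bz2.compress, blocks))

if __name__ == "__main__":
    payload = b"phoronix test suite " * 200_000  # ~4 MB
    packed = parallel_bzip2(payload)
    assert bz2.decompress(packed) == payload  # multi-stream decompress works
```

Because every block is compressed independently, the benchmark scales with core count until I/O or memory bandwidth dominates, which is why it is sensitive to scheduler changes between kernel releases.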
oneDNN OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: IP Shapes 3D - Engine: CPU Linux 6.8 Linux 6.9-rc2 1.0896 2.1792 3.2688 4.3584 5.448 SE +/- 0.02352, N = 3 SE +/- 0.02357, N = 3 4.79853 4.84245 MIN: 3.98 MIN: 4.01 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Google Draco OpenBenchmarking.org ms, Fewer Is Better Google Draco 1.5.6 Model: Church Facade Linux 6.8 Linux 6.9-rc2 1700 3400 5100 6800 8500 SE +/- 0.88, N = 3 SE +/- 9.56, N = 3 7747 7762 1. (CXX) g++ options: -O3
WebP2 Image Encode
OpenBenchmarking.org MP/s, More Is Better WebP2 Image Encode 20220823 Encode Settings: Quality 100, Compression Effort 5 Linux 6.9-rc2 Linux 6.8 2 4 6 8 10 SE +/- 0.07, N = 3 SE +/- 0.11, N = 15 8.45 8.35 1. (CXX) g++ options: -msse4.2 -fno-rtti -O3 -ldl
oneDNN OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: Deconvolution Batch shapes_3d - Engine: CPU Linux 6.9-rc2 Linux 6.8 2 4 6 8 10 SE +/- 0.04472, N = 3 SE +/- 0.10591, N = 15 5.78202 6.16576 MIN: 3.81 MIN: 3.75 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Primesieve OpenBenchmarking.org Seconds, Fewer Is Better Primesieve 12.1 Length: 1e12 Linux 6.9-rc2 Linux 6.8 0.7551 1.5102 2.2653 3.0204 3.7755 SE +/- 0.044, N = 3 SE +/- 0.024, N = 11 3.299 3.356 1. (CXX) g++ options: -O3
Google Draco OpenBenchmarking.org ms, Fewer Is Better Google Draco 1.5.6 Model: Lion Linux 6.9-rc2 Linux 6.8 1300 2600 3900 5200 6500 SE +/- 29.28, N = 3 SE +/- 40.84, N = 3 6054 6129 1. (CXX) g++ options: -O3
Linux 6.8 Processor: 2 x Intel Xeon Max 9480 @ 3.50GHz (112 Cores / 224 Threads), Motherboard: Supermicro SYS-221H-TNR X13DEM v1.10 (1.3 BIOS), Chipset: Intel Device 1bce, Memory: 512GB, Disk: 7682GB INTEL SSDPF2KX076TZ, Graphics: ASPEED, Network: 2 x Broadcom BCM57508 NetXtreme-E 10Gb/25Gb/40Gb/50Gb/100Gb/200Gb
OS: Ubuntu 23.10, Kernel: 6.8.0-060800-generic (x86_64), Desktop: GNOME Shell 45.0, Display Server: X Server 1.21.1.7, Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1024x768
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: intel_cpufreq performance - CPU Microcode: 0x2c000290
Python Notes: Python 3.11.6
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 29 March 2024 19:39 by user phoronix.
Linux 6.9-rc2 Processor: 2 x Intel Xeon Max 9480 @ 3.50GHz (112 Cores / 224 Threads), Motherboard: Supermicro SYS-221H-TNR X13DEM v1.10 (1.3 BIOS), Chipset: Intel Device 1bce, Memory: 512GB, Disk: 7682GB INTEL SSDPF2KX076TZ, Graphics: ASPEED, Network: 2 x Broadcom BCM57508 NetXtreme-E 10Gb/25Gb/40Gb/50Gb/100Gb/200Gb
OS: Ubuntu 23.10, Kernel: 6.9.0-060900rc2-generic (x86_64), Desktop: GNOME Shell 45.0, Display Server: X Server 1.21.1.7, Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1024x768
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: intel_cpufreq performance - CPU Microcode: 0x2c000290
Python Notes: Python 3.11.6
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 1 April 2024 10:41 by user phoronix.