Tests for a future article. Intel Xeon Gold 6421N testing with a Quanta Cloud S6Q-MB-MPS (3A10.uh BIOS) and ASPEED on Ubuntu 22.04 via the Phoronix Test Suite.
a Kernel Notes: Transparent Huge Pages: madviseCompiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -vProcessor Notes: Scaling Governor: intel_pstate performance (EPP: performance) - CPU Microcode: 0x2b0000c0Java Notes: OpenJDK Runtime Environment (build 11.0.16+8-post-Ubuntu-0ubuntu122.04)Python Notes: Python 3.10.6Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
b Processor: Intel Xeon Gold 6421N @ 3.60GHz (32 Cores / 64 Threads), Motherboard: Quanta Cloud S6Q-MB-MPS (3A10.uh BIOS), Chipset: Intel Device 1bce, Memory: 512GB, Disk: 3 x 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Monitor: VGA HDMI, Network: 4 x Intel E810-C for QSFP
OS: Ubuntu 22.04, Kernel: 5.15.0-47-generic (x86_64), Desktop: GNOME Shell 42.4, Display Server: X Server 1.21.1.3, Vulkan: 1.2.204, Compiler: GCC 11.2.0, File-System: ext4, Screen Resolution: 1600x1200
new xeon OpenBenchmarking.org Phoronix Test Suite Intel Xeon Gold 6421N @ 3.60GHz (32 Cores / 64 Threads) Quanta Cloud S6Q-MB-MPS (3A10.uh BIOS) Intel Device 1bce 512GB 3 x 3841GB Micron_9300_MTFDHAL3T8TDP ASPEED VGA HDMI 4 x Intel E810-C for QSFP Ubuntu 22.04 5.15.0-47-generic (x86_64) GNOME Shell 42.4 X Server 1.21.1.3 1.2.204 GCC 11.2.0 ext4 1600x1200 Processor Motherboard Chipset Memory Disk Graphics Monitor Network OS Kernel Desktop Display Server Vulkan Compiler File-System Screen Resolution New Xeon Benchmarks System Logs - Transparent Huge Pages: madvise - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v - Scaling Governor: intel_pstate performance (EPP: performance) - CPU Microcode: 0x2b0000c0 - OpenJDK Runtime Environment (build 11.0.16+8-post-Ubuntu-0ubuntu122.04) - Python 3.10.6 - itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
a vs. b Comparison Phoronix Test Suite Baseline +9.5% +9.5% +19% +19% +28.5% +28.5% 37.8% 22.7% 7.1% 6.5% 6.2% 6.2% 5.8% 4.8% 4.4% 4.4% 4% 3.9% 3.2% 2.8% 2.8% 2.7% 2.7% 2.7% 2.3% 2% 100 - 100 - 200 100 - 100 - 200 26% CPU Cache 256 15.9% 200 - 100 - 200 100 - 100 - 500 500 - 1 - 500 6.2% c2c - Stock - double - 128 B.L.N.Q.A - A.M.S Redis - 100 - 1:10 6.2% 200 - 100 - 200 5.9% B.L.N.Q.A - A.M.S 100 - 100 - 500 5.4% 500 - 1 - 500 Cloning 4.4% N.T.C.B.b.u.S - A.M.S N.T.C.B.b.u.S - A.M.S 500 - 1 - 200 r2c - Stock - float - 256 500 - 1 - 200 3.6% c2c - FFTW - double - 128 3.4% Futex 3.3% P.P.B.T.T Pipe r2c - FFTW - float - 256 100 - 1 - 200 200 - 1 - 200 SENDFILE Redis - 100 - 1:5 2.6% Matrix Math 2.5% 500 - 100 - 500 2.5% 200 - 100 - 500 2.4% 200 - 1 - 500 2.4% 200 - 100 - 500 16 - 256 - 512 Apache IoTDB Apache IoTDB Stress-NG libxsmm Apache IoTDB Apache IoTDB Apache IoTDB HeFFTe - Highly Efficient FFT for Exascale Neural Magic DeepSparse Redis 7.0.12 + memtier_benchmark Apache IoTDB Neural Magic DeepSparse Apache IoTDB Apache IoTDB Stress-NG Neural Magic DeepSparse Neural Magic DeepSparse Apache IoTDB HeFFTe - Highly Efficient FFT for Exascale Apache IoTDB HeFFTe - Highly Efficient FFT for Exascale Stress-NG srsRAN Project Stress-NG HeFFTe - Highly Efficient FFT for Exascale Apache IoTDB Apache IoTDB Stress-NG Redis 7.0.12 + memtier_benchmark Stress-NG Apache IoTDB Apache IoTDB Apache IoTDB Apache IoTDB Liquid-DSP a b
new xeon openfoam: drivaerFastback, Medium Mesh Size - Execution Time openfoam: drivaerFastback, Medium Mesh Size - Mesh Time brl-cad: VGR Performance Metric blender: Barbershop - CPU-Only build-linux-kernel: allmodconfig build-llvm: Unix Makefiles hpcg: 160 160 160 - 60 libxsmm: 128 build-llvm: Ninja hpcg: 144 144 144 - 60 blender: Pabellon Barcelona - CPU-Only laghos: Sedov Blast Wave, ube_922_hex.mesh hpcg: 104 104 104 - 60 libxsmm: 256 blender: Classroom - CPU-Only cassandra: Writes deepsparse: BERT-Large, NLP Question Answering - Asynchronous Multi-Stream deepsparse: BERT-Large, NLP Question Answering - Asynchronous Multi-Stream vvenc: Bosphorus 4K - Fast openfoam: drivaerFastback, Small Mesh Size - Execution Time openfoam: drivaerFastback, Small Mesh Size - Mesh Time palabos: 100 memtier-benchmark: Redis - 100 - 1:10 palabos: 400 memtier-benchmark: Redis - 100 - 1:5 apache-iotdb: 500 - 100 - 500 apache-iotdb: 500 - 100 - 500 memtier-benchmark: Redis - 50 - 1:5 memtier-benchmark: Redis - 50 - 1:10 palabos: 500 blender: Fishy Cat - CPU-Only laghos: Triple Point Problem deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream vvenc: Bosphorus 4K - Faster deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream blender: BMW27 - CPU-Only heffte: c2c - Stock - double - 512 heffte: c2c - FFTW - double - 512 apache-iotdb: 200 - 100 - 500 apache-iotdb: 200 - 100 - 500 deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream build-php: Time To Compile build-gdb: Time To Compile build-linux-kernel: defconfig deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream vvenc: Bosphorus 1080p - Fast deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream apache-iotdb: 500 - 100 - 200 apache-iotdb: 500 - 100 - 200 srsran: PUSCH Processor Benchmark, Throughput Total stress-ng: IO_uring stress-ng: Atomic apache-iotdb: 500 - 1 - 500 apache-iotdb: 500 - 1 - 500 stress-ng: CPU Cache stress-ng: MMAP stress-ng: Cloning stress-ng: Malloc stress-ng: MEMFD stress-ng: Zlib stress-ng: Glibc Qsort Data Sorting stress-ng: Fused Multiply-Add stress-ng: Pthread stress-ng: System V Message Passing stress-ng: Hash stress-ng: Vector Math liquid-dsp: 64 - 256 - 512 stress-ng: Futex stress-ng: Socket Activity stress-ng: Vector Shuffle stress-ng: Matrix 3D Math stress-ng: NUMA stress-ng: Vector Floating Point stress-ng: Pipe stress-ng: Wide Vector Math stress-ng: x86_64 RdRand stress-ng: AVL Tree stress-ng: Forking stress-ng: CPU Stress stress-ng: Glibc C String Functions stress-ng: Function Call stress-ng: Matrix Math stress-ng: SENDFILE stress-ng: Crypto stress-ng: Mutex stress-ng: Context Switching liquid-dsp: 32 - 256 - 512 stress-ng: Floating Point stress-ng: Memory Copying stress-ng: Semaphores stress-ng: Poll liquid-dsp: 16 - 256 - 512 liquid-dsp: 64 - 256 - 57 liquid-dsp: 64 - 256 - 32 liquid-dsp: 32 - 256 - 57 liquid-dsp: 32 - 256 - 32 liquid-dsp: 16 - 256 - 57 liquid-dsp: 16 - 256 - 32 heffte: c2c - Stock - float - 512 heffte: r2c - FFTW - double - 512 heffte: r2c - Stock - double - 512 apache-iotdb: 100 - 100 - 500 apache-iotdb: 100 - 100 - 500 heffte: c2c - FFTW - float - 512 apache-iotdb: 200 - 100 - 200 apache-iotdb: 200 - 100 - 200 apache-iotdb: 200 - 1 - 500 apache-iotdb: 200 - 1 - 500 apache-iotdb: 500 - 1 - 200 apache-iotdb: 500 - 1 - 200 vvenc: Bosphorus 1080p - Faster srsran: Downlink Processor Benchmark apache-iotdb: 100 - 100 - 200 apache-iotdb: 100 - 100 - 200 apache-iotdb: 100 - 1 - 500 apache-iotdb: 100 - 1 - 500 apache-iotdb: 200 - 1 - 200 apache-iotdb: 200 - 1 - 200 apache-iotdb: 100 - 1 - 200 apache-iotdb: 100 - 1 - 200 libxsmm: 64 heffte: r2c - Stock - float - 512 heffte: r2c - FFTW - float - 512 libxsmm: 32 srsran: PUSCH Processor Benchmark, Throughput Thread heffte: c2c - FFTW - double - 256 heffte: c2c - Stock - double - 256 heffte: r2c - FFTW - double - 256 heffte: c2c - Stock - float - 256 heffte: c2c - FFTW - float - 256 heffte: r2c - Stock - double - 256 heffte: r2c - FFTW - float - 256 heffte: r2c - Stock - float - 256 heffte: c2c - Stock - double - 128 heffte: c2c - FFTW - double - 128 heffte: c2c - Stock - float - 128 heffte: r2c - Stock - double - 128 heffte: c2c - FFTW - float - 128 heffte: r2c - FFTW - double - 128 heffte: r2c - Stock - float - 128 heffte: r2c - FFTW - float - 128 a b 615.99074 144.69646 466686 493.45 445.385 323.856 27.5086 1211.8 263.154 27.4213 159.94 216.86 27.7808 879.6 127.78 155626 453.4802 35.1529 5.842 67.707331 27.965214 235.186 2447092.01 287.268 2285996.17 68.34 67607191.64 2211638.65 2316281.26 300.276 64.07 177.78 31.6750 504.6114 11.020 14.8600 1074.8218 116.3761 137.3780 460.7818 34.5311 468.8046 33.9358 345.1491 46.3307 47.15 40.7438 43.9665 101.25 45677447.24 131.4497 121.6893 40.9109 390.9076 42.351 41.905 40.438 54.0674 295.8295 76.5807 208.8471 76.5597 208.8975 16.100 33.3278 479.7876 33.3894 478.9108 4.9416 3227.0954 31.58 56894390.61 5372.9 1529665.98 133.83 22.97 1916642.9 1537111.20 861.28 9740.57 99373474.31 549.94 2647.81 696.65 34197705.63 136846.01 5852281.71 5577252.32 151386.31 513135000 1541676.36 24947.14 167204.21 9599.93 390.87 58243.38 35837711.85 1745029.27 331416.52 294.26 89918.21 64111.11 26067360.60 22028.03 160653.44 582724.63 50240.09 15147444.51 2572801.75 383555000 10587.48 7176.19 62126446.21 3669281.69 243940000 1728850000 1577300000 1328100000 847085000 848435000 557945000 72.5609 74.4734 76.6110 69.08 59041436.64 78.8291 29.54 54224351.1 26.29 1505080.34 9.49 1576432.25 30.946 705.8 31.83 43074031.84 28.27 1191500.88 11.86 1045806.81 14.58 710382.44 833.8 137.536 141.41 440.0 240.4 38.9304 38.9613 72.2893 75.0892 76.0299 76.9042 149.825 157.867 46.6357 64.4263 85.7398 92.3973 131.656 121.794 149.935 207.244 615.46018 144.93674 493.61 445.380 319.852 27.3978 1225.0 262.884 27.3890 217.19 27.8405 758.9 127.76 428.6695 37.3253 5.917 67.563163 27.948717 234.874 2304730.19 285.761 2227152.02 68.01 65935725.67 2217192.12 2293467.62 300.855 64.01 176.92 31.6488 505.1309 10.992 14.8473 1075.9571 111.4976 143.4387 460.7588 34.5539 460.6707 34.5447 343.5170 46.5491 47.22 40.6648 44.0064 98.87 46726912.46 131.0664 122.0367 40.8061 391.9125 42.382 42.006 40.451 53.3291 299.9277 75.7218 211.2270 76.4684 208.9908 16.249 33.2781 480.5223 33.3680 479.2241 4.9312 3233.9588 31.69 56137174.7 5543.7 1503623.79 132.61 21.63 2009050.46 1885833.11 856.14 9326.09 99251227.28 549.55 2648.81 696.92 34050669.23 136709.81 5854201.78 5583978.14 151431.15 513040000 1492979.46 25282.31 167202.07 9605.30 392.08 58232.70 36852791.12 1750003.43 331423.04 294.66 89966.29 64118.87 26125214.84 22106.49 156668.43 598173.56 50243.48 15192892.59 2571092.69 378650000 10601.10 7180.43 61651485.43 3671617.97 248820000 1733700000 1576850000 1323900000 847675000 862195000 558655000 72.5391 74.7148 76.6041 73.56 56018457.87 78.9605 31.63 51199962.11 26.64 1469808.89 9.87 1521587.4 30.927 710.9 43.86 34191814.86 28.45 1185338.02 12.18 1042859.03 14.98 697217.55 839.9 137.740 141.193 444.6 236.3 38.5182 38.6757 72.1981 74.9286 75.3001 77.0345 154.053 164.047 49.5230 62.2974 85.4850 90.9851 130.982 122.460 151.803 206.217 OpenBenchmarking.org
OpenFOAM OpenFOAM is the leading free, open-source software for computational fluid dynamics (CFD). This test profile currently uses the drivaerFastback test case for analyzing automotive aerodynamics or alternatively the older motorBike input. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better OpenFOAM 10 Input: drivaerFastback, Medium Mesh Size - Execution Time a b 130 260 390 520 650 SE +/- 0.42, N = 2 SE +/- 0.03, N = 2 615.99 615.46 1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm
OpenBenchmarking.org Seconds, Fewer Is Better OpenFOAM 10 Input: drivaerFastback, Medium Mesh Size - Mesh Time b a 30 60 90 120 150 SE +/- 0.08, N = 2 SE +/- 0.01, N = 2 144.94 144.70 1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm
BRL-CAD BRL-CAD is a cross-platform, open-source solid modeling system with built-in benchmark mode. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org VGR Performance Metric, More Is Better BRL-CAD 7.36 VGR Performance Metric a 100K 200K 300K 400K 500K SE +/- 3768.50, N = 2 466686 1. (CXX) g++ options: -std=c++14 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lregex_brl -lz_brl -lnetpbm -ldl -lm -ltk8.6
Blender OpenBenchmarking.org Seconds, Fewer Is Better Blender 3.6 Blend File: Barbershop - Compute: CPU-Only b a 110 220 330 440 550 SE +/- 0.42, N = 2 SE +/- 0.22, N = 2 493.61 493.45
libxsmm Libxsmm is an open-source library for specialized dense and sparse matrix operations and deep learning primitives. Libxsmm supports making use of Intel AMX, AVX-512, and other modern CPU instruction set capabilities. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org GFLOPS/s, More Is Better libxsmm 2-1.17-3645 M N K: 128 a b 300 600 900 1200 1500 SE +/- 4.60, N = 2 SE +/- 1.10, N = 2 1211.8 1225.0 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
Laghos Laghos (LAGrangian High-Order Solver) is a miniapp that solves the time-dependent Euler equations of compressible gas dynamics in a moving Lagrangian frame using unstructured high-order finite element spatial discretization and explicit high-order time-stepping. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Major Kernels Total Rate, More Is Better Laghos 3.1 Test: Sedov Blast Wave, ube_922_hex.mesh a b 50 100 150 200 250 SE +/- 0.24, N = 2 SE +/- 0.18, N = 2 216.86 217.19 1. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi
libxsmm Libxsmm is an open-source library for specialized dense and sparse matrix operations and deep learning primitives. Libxsmm supports making use of Intel AMX, AVX-512, and other modern CPU instruction set capabilities. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org GFLOPS/s, More Is Better libxsmm 2-1.17-3645 M N K: 256 b a 200 400 600 800 1000 SE +/- 5.75, N = 2 SE +/- 0.65, N = 2 758.9 879.6 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
Blender OpenBenchmarking.org Seconds, Fewer Is Better Blender 3.6 Blend File: Classroom - Compute: CPU-Only a b 30 60 90 120 150 SE +/- 0.05, N = 2 SE +/- 0.13, N = 2 127.78 127.76
Neural Magic DeepSparse OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream a b 100 200 300 400 500 SE +/- 0.26, N = 2 SE +/- 4.41, N = 2 453.48 428.67
VVenC VVenC is the Fraunhofer Versatile Video Encoder as a fast/efficient H.266/VVC encoder. The vvenc encoder makes use of SIMD Everywhere (SIMDe). The vvenc software is published under the Clear BSD License. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.9 Video Input: Bosphorus 4K - Video Preset: Fast a b 1.3313 2.6626 3.9939 5.3252 6.6565 SE +/- 0.074, N = 2 SE +/- 0.015, N = 2 5.842 5.917 1. (CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto
OpenFOAM OpenFOAM is the leading free, open-source software for computational fluid dynamics (CFD). This test profile currently uses the drivaerFastback test case for analyzing automotive aerodynamics or alternatively the older motorBike input. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better OpenFOAM 10 Input: drivaerFastback, Small Mesh Size - Execution Time a b 15 30 45 60 75 SE +/- 0.09, N = 2 SE +/- 0.11, N = 2 67.71 67.56 1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm
OpenBenchmarking.org Seconds, Fewer Is Better OpenFOAM 10 Input: drivaerFastback, Small Mesh Size - Mesh Time a b 7 14 21 28 35 SE +/- 0.02, N = 2 SE +/- 0.05, N = 2 27.97 27.95 1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm
Palabos The Palabos library is a framework for general purpose Computational Fluid Dynamics (CFD). Palabos uses a kernel based on the Lattice Boltzmann method. This test profile uses the Palabos MPI-based Cavity3D benchmark. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Mega Site Updates Per Second, More Is Better Palabos 2.3 Grid Size: 100 b a 50 100 150 200 250 SE +/- 0.34, N = 2 SE +/- 0.02, N = 2 234.87 235.19 1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
Palabos The Palabos library is a framework for general purpose Computational Fluid Dynamics (CFD). Palabos uses a kernel based on the Lattice Boltzmann method. This test profile uses the Palabos MPI-based Cavity3D benchmark. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Mega Site Updates Per Second, More Is Better Palabos 2.3 Grid Size: 400 b a 60 120 180 240 300 SE +/- 1.54, N = 2 SE +/- 0.49, N = 2 285.76 287.27 1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
Redis 7.0.12 + memtier_benchmark Memtier_benchmark is a NoSQL Redis/Memcache traffic generation plus benchmarking tool developed by Redis Labs. Learn more via the OpenBenchmarking.org test page.
Protocol: Redis - Clients: 500 - Set To Get Ratio: 1:10
a: The test run did not produce a result.
b: The test run did not produce a result.
Protocol: Redis - Clients: 500 - Set To Get Ratio: 1:5
a: The test run did not produce a result.
b: The test run did not produce a result.
OpenBenchmarking.org Ops/sec, More Is Better Redis 7.0.12 + memtier_benchmark 2.0 Protocol: Redis - Clients: 100 - Set To Get Ratio: 1:5 b a 500K 1000K 1500K 2000K 2500K SE +/- 3990.38, N = 2 SE +/- 6000.63, N = 2 2227152.02 2285996.17 1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Apache IoTDB OpenBenchmarking.org Average Latency, More Is Better Apache IoTDB 1.1.2 Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 500 b a 15 30 45 60 75 68.01 68.34 MAX: 1606.75 MAX: 2006.68
OpenBenchmarking.org Ops/sec, More Is Better Redis 7.0.12 + memtier_benchmark 2.0 Protocol: Redis - Clients: 50 - Set To Get Ratio: 1:10 b a 500K 1000K 1500K 2000K 2500K SE +/- 4548.93, N = 2 SE +/- 13610.76, N = 2 2293467.62 2316281.26 1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Palabos The Palabos library is a framework for general purpose Computational Fluid Dynamics (CFD). Palabos uses a kernel based on the Lattice Boltzmann method. This test profile uses the Palabos MPI-based Cavity3D benchmark. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Mega Site Updates Per Second, More Is Better Palabos 2.3 Grid Size: 500 a b 70 140 210 280 350 SE +/- 1.63, N = 2 SE +/- 1.17, N = 2 300.28 300.86 1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
Blender OpenBenchmarking.org Seconds, Fewer Is Better Blender 3.6 Blend File: Fishy Cat - Compute: CPU-Only a b 14 28 42 56 70 SE +/- 0.08, N = 2 SE +/- 0.20, N = 2 64.07 64.01
Laghos Laghos (LAGrangian High-Order Solver) is a miniapp that solves the time-dependent Euler equations of compressible gas dynamics in a moving Lagrangian frame using unstructured high-order finite element spatial discretization and explicit high-order time-stepping. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Major Kernels Total Rate, More Is Better Laghos 3.1 Test: Triple Point Problem b a 40 80 120 160 200 SE +/- 0.02, N = 2 SE +/- 0.13, N = 2 176.92 177.78 1. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi
Neural Magic DeepSparse OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream a b 7 14 21 28 35 SE +/- 0.01, N = 2 SE +/- 0.01, N = 2 31.68 31.65
VVenC VVenC is the Fraunhofer Versatile Video Encoder as a fast/efficient H.266/VVC encoder. The vvenc encoder makes use of SIMD Everywhere (SIMDe). The vvenc software is published under the Clear BSD License. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.9 Video Input: Bosphorus 4K - Video Preset: Faster b a 3 6 9 12 15 SE +/- 0.03, N = 2 SE +/- 0.00, N = 2 10.99 11.02 1. (CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto
Neural Magic DeepSparse OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream a b 4 8 12 16 20 SE +/- 0.01, N = 2 SE +/- 0.01, N = 2 14.86 14.85
OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream a b 200 400 600 800 1000 SE +/- 0.57, N = 2 SE +/- 1.01, N = 2 1074.82 1075.96
OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream a b 100 200 300 400 500 SE +/- 0.42, N = 2 SE +/- 2.44, N = 2 460.78 460.76
OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream a b 100 200 300 400 500 SE +/- 1.46, N = 2 SE +/- 0.20, N = 2 468.80 460.67
Blender OpenBenchmarking.org Seconds, Fewer Is Better Blender 3.6 Blend File: BMW27 - Compute: CPU-Only b a 11 22 33 44 55 SE +/- 0.08, N = 2 SE +/- 0.02, N = 2 47.22 47.15
HeFFTe - Highly Efficient FFT for Exascale HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently catering to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: double - X Y Z: 512 b a 9 18 27 36 45 SE +/- 0.00, N = 2 SE +/- 0.05, N = 2 40.66 40.74 1. (CXX) g++ options: -O3
Apache IoTDB OpenBenchmarking.org Average Latency, More Is Better Apache IoTDB 1.1.2 Device Count: 200 - Batch Size Per Write: 100 - Sensor Count: 500 b a 20 40 60 80 100 98.87 101.25 MAX: 3564.64 MAX: 3631.89
Neural Magic DeepSparse OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream a b 30 60 90 120 150 SE +/- 0.05, N = 2 SE +/- 0.22, N = 2 131.45 131.07
OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream a b 30 60 90 120 150 SE +/- 0.05, N = 2 SE +/- 0.22, N = 2 121.69 122.04
OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream a b 9 18 27 36 45 SE +/- 0.11, N = 2 SE +/- 0.01, N = 2 40.91 40.81
OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream a b 90 180 270 360 450 SE +/- 1.01, N = 2 SE +/- 0.12, N = 2 390.91 391.91
Neural Magic DeepSparse OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream a b 12 24 36 48 60 SE +/- 0.01, N = 2 SE +/- 0.09, N = 2 54.07 53.33
VVenC VVenC is the Fraunhofer Versatile Video Encoder as a fast/efficient H.266/VVC encoder. The vvenc encoder makes use of SIMD Everywhere (SIMDe). The vvenc software is published under the Clear BSD License. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.9 Video Input: Bosphorus 1080p - Video Preset: Fast a b 4 8 12 16 20 SE +/- 0.17, N = 2 SE +/- 0.02, N = 2 16.10 16.25 1. (CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto
Apache IoTDB OpenBenchmarking.org Average Latency, More Is Better Apache IoTDB 1.1.2 Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 200 a b 7 14 21 28 35 31.58 31.69 MAX: 1920.32 MAX: 1610.79
srsRAN Project srsRAN Project is a complete ORAN-native 5G RAN solution created by Software Radio Systems (SRS). The srsRAN Project radio suite was formerly known as srsLTE and can be used for building your own software-defined radio (SDR) 4G/5G mobile network. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.5 Test: PUSCH Processor Benchmark, Throughput Total a b 1200 2400 3600 4800 6000 SE +/- 143.30, N = 2 SE +/- 95.40, N = 2 5372.9 5543.7 1. (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Malloc b a 20M 40M 60M 80M 100M SE +/- 83929.32, N = 2 SE +/- 129754.02, N = 2 99251227.28 99373474.31 1. (CXX) g++ options: -O2 -std=gnu99 -lc
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Glibc Qsort Data Sorting a b 150 300 450 600 750 SE +/- 0.40, N = 2 SE +/- 0.46, N = 2 696.65 696.92 1. (CXX) g++ options: -O2 -std=gnu99 -lc
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Fused Multiply-Add b a 7M 14M 21M 28M 35M SE +/- 285.63, N = 2 SE +/- 137631.48, N = 2 34050669.23 34197705.63 1. (CXX) g++ options: -O2 -std=gnu99 -lc
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Pthread b a 30K 60K 90K 120K 150K SE +/- 102.07, N = 2 SE +/- 971.78, N = 2 136709.81 136846.01 1. (CXX) g++ options: -O2 -std=gnu99 -lc
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: System V Message Passing a b 1.3M 2.6M 3.9M 5.2M 6.5M SE +/- 7174.98, N = 2 SE +/- 9802.94, N = 2 5852281.71 5854201.78 1. (CXX) g++ options: -O2 -std=gnu99 -lc
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Hash a b 1.2M 2.4M 3.6M 4.8M 6M SE +/- 3166.95, N = 2 SE +/- 2865.25, N = 2 5577252.32 5583978.14 1. (CXX) g++ options: -O2 -std=gnu99 -lc
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Vector Math a b 30K 60K 90K 120K 150K SE +/- 47.16, N = 2 SE +/- 5.98, N = 2 151386.31 151431.15 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Liquid-DSP LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 64 - Buffer Length: 256 - Filter Length: 512 b a 110M 220M 330M 440M 550M SE +/- 800000.00, N = 2 SE +/- 385000.00, N = 2 513040000 513135000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Socket Activity a b 5K 10K 15K 20K 25K SE +/- 72.57, N = 2 SE +/- 267.39, N = 2 24947.14 25282.31 1. (CXX) g++ options: -O2 -std=gnu99 -lc
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Vector Shuffle b a 40K 80K 120K 160K 200K SE +/- 6.04, N = 2 SE +/- 6.63, N = 2 167202.07 167204.21 1. (CXX) g++ options: -O2 -std=gnu99 -lc
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Matrix 3D Math a b 2K 4K 6K 8K 10K SE +/- 34.45, N = 2 SE +/- 4.08, N = 2 9599.93 9605.30 1. (CXX) g++ options: -O2 -std=gnu99 -lc
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Vector Floating Point b a 12K 24K 36K 48K 60K SE +/- 4.11, N = 2 SE +/- 30.71, N = 2 58232.70 58243.38 1. (CXX) g++ options: -O2 -std=gnu99 -lc
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Pipe a b 8M 16M 24M 32M 40M SE +/- 1105250.10, N = 2 SE +/- 79631.10, N = 2 35837711.85 36852791.12 1. (CXX) g++ options: -O2 -std=gnu99 -lc
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Wide Vector Math a b 400K 800K 1200K 1600K 2000K SE +/- 918.08, N = 2 SE +/- 4139.63, N = 2 1745029.27 1750003.43 1. (CXX) g++ options: -O2 -std=gnu99 -lc
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: x86_64 RdRand a b 70K 140K 210K 280K 350K SE +/- 2.35, N = 2 SE +/- 1.14, N = 2 331416.52 331423.04 1. (CXX) g++ options: -O2 -std=gnu99 -lc
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Forking a b 20K 40K 60K 80K 100K SE +/- 469.20, N = 2 SE +/- 421.24, N = 2 89918.21 89966.29 1. (CXX) g++ options: -O2 -std=gnu99 -lc
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: CPU Stress a b 14K 28K 42K 56K 70K SE +/- 12.73, N = 2 SE +/- 38.95, N = 2 64111.11 64118.87 1. (CXX) g++ options: -O2 -std=gnu99 -lc
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Glibc C String Functions a b 6M 12M 18M 24M 30M SE +/- 150617.25, N = 2 SE +/- 69329.81, N = 2 26067360.60 26125214.84 1. (CXX) g++ options: -O2 -std=gnu99 -lc
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Function Call a b 5K 10K 15K 20K 25K SE +/- 80.03, N = 2 SE +/- 74.09, N = 2 22028.03 22106.49 1. (CXX) g++ options: -O2 -std=gnu99 -lc
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Matrix Math b a 30K 60K 90K 120K 150K SE +/- 332.46, N = 2 SE +/- 2867.57, N = 2 156668.43 160653.44 1. (CXX) g++ options: -O2 -std=gnu99 -lc
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: SENDFILE a b 130K 260K 390K 520K 650K SE +/- 6799.74, N = 2 SE +/- 243.97, N = 2 582724.63 598173.56 1. (CXX) g++ options: -O2 -std=gnu99 -lc
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Crypto a b 11K 22K 33K 44K 55K SE +/- 3.65, N = 2 SE +/- 18.13, N = 2 50240.09 50243.48 1. (CXX) g++ options: -O2 -std=gnu99 -lc
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Mutex a b 3M 6M 9M 12M 15M SE +/- 23940.47, N = 2 SE +/- 2864.48, N = 2 15147444.51 15192892.59 1. (CXX) g++ options: -O2 -std=gnu99 -lc
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Context Switching b a 600K 1200K 1800K 2400K 3000K SE +/- 604.17, N = 2 SE +/- 678.57, N = 2 2571092.69 2572801.75 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Liquid-DSP LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 32 - Buffer Length: 256 - Filter Length: 512 b a 80M 160M 240M 320M 400M SE +/- 4920000.00, N = 2 SE +/- 1955000.00, N = 2 378650000 383555000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Memory Copying a b 1500 3000 4500 6000 7500 SE +/- 8.71, N = 2 SE +/- 11.04, N = 2 7176.19 7180.43 1. (CXX) g++ options: -O2 -std=gnu99 -lc
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Semaphores b a 13M 26M 39M 52M 65M SE +/- 466593.23, N = 2 SE +/- 2077286.42, N = 2 61651485.43 62126446.21 1. (CXX) g++ options: -O2 -std=gnu99 -lc
OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Poll a b 800K 1600K 2400K 3200K 4000K SE +/- 2536.76, N = 2 SE +/- 1953.54, N = 2 3669281.69 3671617.97 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Liquid-DSP LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 16 - Buffer Length: 256 - Filter Length: 512 a b 50M 100M 150M 200M 250M SE +/- 1950000.00, N = 2 SE +/- 3170000.00, N = 2 243940000 248820000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 64 - Buffer Length: 256 - Filter Length: 57 a b 400M 800M 1200M 1600M 2000M SE +/- 550000.00, N = 2 SE +/- 900000.00, N = 2 1728850000 1733700000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 64 - Buffer Length: 256 - Filter Length: 32 b a 300M 600M 900M 1200M 1500M SE +/- 450000.00, N = 2 SE +/- 300000.00, N = 2 1576850000 1577300000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 32 - Buffer Length: 256 - Filter Length: 57 b a 300M 600M 900M 1200M 1500M SE +/- 4400000.00, N = 2 SE +/- 300000.00, N = 2 1323900000 1328100000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 32 - Buffer Length: 256 - Filter Length: 32 a b 200M 400M 600M 800M 1000M SE +/- 25000.00, N = 2 SE +/- 85000.00, N = 2 847085000 847675000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 16 - Buffer Length: 256 - Filter Length: 57 a b 200M 400M 600M 800M 1000M SE +/- 14365000.00, N = 2 SE +/- 695000.00, N = 2 848435000 862195000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 16 - Buffer Length: 256 - Filter Length: 32 a b 120M 240M 360M 480M 600M SE +/- 2065000.00, N = 2 SE +/- 605000.00, N = 2 557945000 558655000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
HeFFTe - Highly Efficient FFT for Exascale HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently catering to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: float - X Y Z: 512 b a 16 32 48 64 80 SE +/- 0.00, N = 2 SE +/- 0.21, N = 2 72.54 72.56 1. (CXX) g++ options: -O3
Apache IoTDB OpenBenchmarking.org Average Latency, More Is Better Apache IoTDB 1.1.2 Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 500 a b 16 32 48 64 80 69.08 73.56 MAX: 1049.85 MAX: 1309.93
HeFFTe - Highly Efficient FFT for Exascale HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently catering to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512 a b 20 40 60 80 100 SE +/- 0.36, N = 2 SE +/- 0.06, N = 2 78.83 78.96 1. (CXX) g++ options: -O3
VVenC VVenC is the Fraunhofer Versatile Video Encoder as a fast/efficient H.266/VVC encoder. The vvenc encoder makes use of SIMD Everywhere (SIMDe). The vvenc software is published under the Clear BSD License. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.9 Video Input: Bosphorus 1080p - Video Preset: Faster b a 7 14 21 28 35 SE +/- 0.04, N = 2 SE +/- 0.06, N = 2 30.93 30.95 1. (CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto
srsRAN Project srsRAN Project is a complete ORAN-native 5G RAN solution created by Software Radio Systems (SRS). The srsRAN Project radio suite was formerly known as srsLTE and can be used for building your own software-defined radio (SDR) 4G/5G mobile network. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.5 Test: Downlink Processor Benchmark a b 150 300 450 600 750 SE +/- 5.15, N = 2 SE +/- 1.60, N = 2 705.8 710.9 1. (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest
Apache IoTDB OpenBenchmarking.org Average Latency, More Is Better Apache IoTDB 1.1.2 Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 200 a b 10 20 30 40 50 31.83 43.86 MAX: 790.74 MAX: 2550.76
libxsmm Libxsmm is an open-source library for specialized dense and sparse matrix operations and deep learning primitives. Libxsmm supports making use of Intel AMX, AVX-512, and other modern CPU instruction set capabilities. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org GFLOPS/s, More Is Better libxsmm 2-1.17-3645 M N K: 64 a b 200 400 600 800 1000 SE +/- 1.05, N = 2 SE +/- 0.20, N = 2 833.8 839.9 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
HeFFTe - Highly Efficient FFT for Exascale HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently catering to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: float - X Y Z: 512 a b 30 60 90 120 150 SE +/- 0.00, N = 2 SE +/- 0.33, N = 2 137.54 137.74 1. (CXX) g++ options: -O3
libxsmm Libxsmm is an open-source library for specialized dense and sparse matrix operations and deep learning primitives. Libxsmm supports making use of Intel AMX, AVX-512, and other modern CPU instruction set capabilities. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org GFLOPS/s, More Is Better libxsmm 2-1.17-3645 M N K: 32 a b 100 200 300 400 500 SE +/- 0.25, N = 2 SE +/- 0.15, N = 2 440.0 444.6 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
srsRAN Project srsRAN Project is a complete ORAN-native 5G RAN solution created by Software Radio Systems (SRS). The srsRAN Project radio suite was formerly known as srsLTE and can be used for building your own software-defined radio (SDR) 4G/5G mobile network. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.5 Test: PUSCH Processor Benchmark, Throughput Thread b a 50 100 150 200 250 SE +/- 0.10, N = 2 SE +/- 3.55, N = 2 236.3 240.4 1. (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest
a Kernel Notes: Transparent Huge Pages: madviseCompiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -vProcessor Notes: Scaling Governor: intel_pstate performance (EPP: performance) - CPU Microcode: 0x2b0000c0Java Notes: OpenJDK Runtime Environment (build 11.0.16+8-post-Ubuntu-0ubuntu122.04)Python Notes: Python 3.10.6Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 30 July 2023 19:35 by user phoronix.
b Processor: Intel Xeon Gold 6421N @ 3.60GHz (32 Cores / 64 Threads), Motherboard: Quanta Cloud S6Q-MB-MPS (3A10.uh BIOS), Chipset: Intel Device 1bce, Memory: 512GB, Disk: 3 x 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Monitor: VGA HDMI, Network: 4 x Intel E810-C for QSFP
OS: Ubuntu 22.04, Kernel: 5.15.0-47-generic (x86_64), Desktop: GNOME Shell 42.4, Display Server: X Server 1.21.1.3, Vulkan: 1.2.204, Compiler: GCC 11.2.0, File-System: ext4, Screen Resolution: 1600x1200
Kernel Notes: Transparent Huge Pages: madviseCompiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -vProcessor Notes: Scaling Governor: intel_pstate performance (EPP: performance) - CPU Microcode: 0x2b0000c0Java Notes: OpenJDK Runtime Environment (build 11.0.16+8-post-Ubuntu-0ubuntu122.04)Python Notes: Python 3.10.6Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 31 July 2023 05:12 by user phoronix.