new xeon

Tests for a future article. Intel Xeon Gold 6421N testing with a Quanta Cloud S6Q-MB-MPS (3A10.uh BIOS) and ASPEED on Ubuntu 22.04 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2307315-NE-NEWXEON9432

Run Management

Run a: July 30 2023, test duration 5 Hours, 55 Minutes
Run b: July 31 2023, test duration 5 Hours, 22 Minutes
Average test duration: 5 Hours, 38 Minutes


New Xeon Benchmarks

Processor: Intel Xeon Gold 6421N @ 3.60GHz (32 Cores / 64 Threads)
Motherboard: Quanta Cloud S6Q-MB-MPS (3A10.uh BIOS)
Chipset: Intel Device 1bce
Memory: 512GB
Disk: 3 x 3841GB Micron_9300_MTFDHAL3T8TDP
Graphics: ASPEED
Monitor: VGA HDMI
Network: 4 x Intel E810-C for QSFP
OS: Ubuntu 22.04
Kernel: 5.15.0-47-generic (x86_64)
Desktop: GNOME Shell 42.4
Display Server: X Server 1.21.1.3
Vulkan: 1.2.204
Compiler: GCC 11.2.0
File-System: ext4
Screen Resolution: 1600x1200

System Logs

- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: intel_pstate performance (EPP: performance)
- CPU Microcode: 0x2b0000c0
- OpenJDK Runtime Environment (build 11.0.16+8-post-Ubuntu-0ubuntu122.04)
- Python 3.10.6
- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected

[Figure: a vs. b Comparison (Phoronix Test Suite). Per-test percentage difference between runs a and b relative to the baseline, on an axis from Baseline to +28.5%. Tests shown span Apache IoTDB, Stress-NG, libxsmm, HeFFTe, Neural Magic DeepSparse, Redis 7.0.12 + memtier_benchmark, srsRAN Project, and Liquid-DSP.]

[Table: new xeon detailed result table (OpenBenchmarking.org) for runs a and b, covering heffte, laghos, libxsmm, stress-ng, palabos, brl-cad, deepsparse, hpcg, openfoam, build-gdb, build-llvm, build-php, build-linux-kernel, blender, vvenc, liquid-dsp, srsran, apache-iotdb, memtier-benchmark, and cassandra. Per-test results are listed individually below.]

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently caters to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

HeFFTe - Highly Efficient FFT for Exascale 2.3 (GFLOP/s, More Is Better)

Test: c2c - Backend: Stock - Precision: double - X Y Z: 128
b: 49.52 (SE +/- 3.39, N = 2); a: 46.64 (SE +/- 0.26, N = 2)

Test: c2c - Backend: Stock - Precision: double - X Y Z: 512
b: 40.66 (SE +/- 0.00, N = 2); a: 40.74 (SE +/- 0.05, N = 2)

Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512
b: 74.71 (SE +/- 0.16, N = 2); a: 74.47 (SE +/- 0.48, N = 2)

Test: r2c - Backend: Stock - Precision: float - X Y Z: 256
b: 164.05 (SE +/- 3.21, N = 2); a: 157.87 (SE +/- 6.51, N = 2)

Test: c2c - Backend: Stock - Precision: float - X Y Z: 256
b: 74.93 (SE +/- 0.10, N = 2); a: 75.09 (SE +/- 0.48, N = 2)

Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128
b: 122.46 (SE +/- 1.22, N = 2); a: 121.79 (SE +/- 0.56, N = 2)

1. (CXX) g++ options: -O3
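When reading a pair of results like these, it can help to put the gap between runs next to the reported standard errors. The sketch below is only an illustration, not part of the Phoronix Test Suite: the helper names `percent_difference` and `within_combined_se` are hypothetical, and treating a gap smaller than the combined standard error as probable noise is an assumed rule of thumb, especially with N = 2. It uses the c2c / Stock / double / 128 figures reported above.

```python
import math


def percent_difference(higher: float, lower: float) -> float:
    """Gap between two results as a percentage of the lower one."""
    return (higher - lower) / lower * 100.0


def within_combined_se(mean_x: float, se_x: float, mean_y: float, se_y: float) -> bool:
    """True when the gap between two means is no larger than their combined standard error."""
    combined = math.sqrt(se_x ** 2 + se_y ** 2)
    return abs(mean_x - mean_y) <= combined


# HeFFTe c2c / Stock / double / 128, in GFLOP/s, as reported for runs b and a.
b_mean, b_se = 49.52, 3.39
a_mean, a_se = 46.64, 0.26

gap = percent_difference(b_mean, a_mean)
print(f"b leads a by {gap:.1f}%")
print("within combined SE:", within_combined_se(b_mean, b_se, a_mean, a_se))
```

For this pair the gap works out to about 6.2%, yet it is smaller than the combined standard error, so under this assumed heuristic the difference would not be read as a real change between runs.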

Laghos

Laghos (LAGrangian High-Order Solver) is a miniapp that solves the time-dependent Euler equations of compressible gas dynamics in a moving Lagrangian frame using unstructured high-order finite element spatial discretization and explicit high-order time-stepping. Learn more via the OpenBenchmarking.org test page.

Laghos 3.1 (Major Kernels Total Rate, More Is Better)

Test: Triple Point Problem
b: 176.92 (SE +/- 0.02, N = 2); a: 177.78 (SE +/- 0.13, N = 2)

1. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi

libxsmm

Libxsmm is an open-source library for specialized dense and sparse matrix operations and deep learning primitives. Libxsmm supports making use of Intel AMX, AVX-512, and other modern CPU instruction set capabilities. Learn more via the OpenBenchmarking.org test page.

libxsmm 2-1.17-3645 (GFLOPS/s, More Is Better)

M N K: 32
b: 444.6 (SE +/- 0.15, N = 2); a: 440.0 (SE +/- 0.25, N = 2)

M N K: 64
b: 839.9 (SE +/- 0.20, N = 2); a: 833.8 (SE +/- 1.05, N = 2)

M N K: 256
b: 758.9 (SE +/- 5.75, N = 2); a: 879.6 (SE +/- 0.65, N = 2)

M N K: 128
b: 1225.0 (SE +/- 1.10, N = 2); a: 1211.8 (SE +/- 4.60, N = 2)

1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
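One way to summarize several results from the same test is a geometric mean of the per-size b/a ratios, which keeps a single large matrix size from dominating the average. The sketch below is an illustration under that assumption, using the four libxsmm figures reported above; the `geometric_mean` helper and the `results` layout are hypothetical and are not how the Phoronix Test Suite itself computes its overall means.

```python
import math


def geometric_mean(values):
    """Geometric mean via the mean of logarithms; all values must be positive."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# libxsmm GFLOPS/s keyed by M N K size, as reported for runs (b, a).
results = {
    32: (444.6, 440.0),
    64: (839.9, 833.8),
    256: (758.9, 879.6),
    128: (1225.0, 1211.8),
}

# Ratio above 1.0 means run b was faster for that size.
ratios = {size: b / a for size, (b, a) in results.items()}
overall = geometric_mean(list(ratios.values()))

for size, ratio in ratios.items():
    print(f"M N K {size}: b/a = {ratio:.3f}")
print(f"geometric mean of b/a: {overall:.3f}")
```

With these inputs the geometric mean comes out at roughly 0.97. Three of the four sizes differ by about 1%, so the summary is driven almost entirely by the M N K: 256 result.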

Stress-NG

Stress-NG is a Linux stress tool developed by Colin Ian King. Learn more via the OpenBenchmarking.org test page.

Stress-NG 0.15.10 (Bogo Ops/s, More Is Better)

Test: Hash
b: 5583978.14 (SE +/- 2865.25, N = 2); a: 5577252.32 (SE +/- 3166.95, N = 2)

Test: MMAP
b: 856.14 (SE +/- 2.06, N = 2); a: 861.28 (SE +/- 3.32, N = 2)

1. (CXX) g++ options: -O2 -std=gnu99 -lc

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe - Highly Efficient FFT for Exascale 2.3 (GFLOP/s, More Is Better)

Test: r2c - Backend: Stock - Precision: double - X Y Z: 256
b: 77.03 (SE +/- 0.65, N = 2); a: 76.90 (SE +/- 0.40, N = 2)

1. (CXX) g++ options: -O3

Laghos

Laghos 3.1 (Major Kernels Total Rate, More Is Better)

Test: Sedov Blast Wave, ube_922_hex.mesh
b: 217.19 (SE +/- 0.18, N = 2); a: 216.86 (SE +/- 0.24, N = 2)

1. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi

Palabos

The Palabos library is a framework for general purpose Computational Fluid Dynamics (CFD). Palabos uses a kernel based on the Lattice Boltzmann method. This test profile uses the Palabos MPI-based Cavity3D benchmark. Learn more via the OpenBenchmarking.org test page.

Palabos 2.3 (Mega Site Updates Per Second, More Is Better)

Grid Size: 100
b: 234.87 (SE +/- 0.34, N = 2); a: 235.19 (SE +/- 0.02, N = 2)

1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe - Highly Efficient FFT for Exascale 2.3 (GFLOP/s, More Is Better)

Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128
b: 130.98 (SE +/- 0.61, N = 2); a: 131.66 (SE +/- 0.77, N = 2)

Test: c2c - Backend: Stock - Precision: float - X Y Z: 128
b: 85.49 (SE +/- 0.88, N = 2); a: 85.74 (SE +/- 1.30, N = 2)

Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256
b: 75.30 (SE +/- 0.08, N = 2); a: 76.03 (SE +/- 0.70, N = 2)

Test: c2c - Backend: Stock - Precision: float - X Y Z: 512
b: 72.54 (SE +/- 0.00, N = 2); a: 72.56 (SE +/- 0.21, N = 2)

Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512
b: 78.96 (SE +/- 0.06, N = 2); a: 78.83 (SE +/- 0.36, N = 2)

Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256
b: 72.20 (SE +/- 0.12, N = 2); a: 72.29 (SE +/- 0.44, N = 2)

Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128
b: 206.22 (SE +/- 0.19, N = 2); a: 207.24 (SE +/- 0.61, N = 2)

Test: r2c - Backend: Stock - Precision: float - X Y Z: 128
b: 151.80 (SE +/- 1.24, N = 2); a: 149.94 (SE +/- 1.93, N = 2)

Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256
b: 154.05 (SE +/- 1.59, N = 2); a: 149.83 (SE +/- 3.76, N = 2)

Test: r2c - Backend: Stock - Precision: float - X Y Z: 512
b: 137.74 (SE +/- 0.33, N = 2); a: 137.54 (SE +/- 0.00, N = 2)

Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512
b: 141.19 (SE +/- 0.20, N = 2); a: 141.41 (SE +/- 0.63, N = 2)

Test: c2c - Backend: Stock - Precision: double - X Y Z: 256
b: 38.68 (SE +/- 0.07, N = 2); a: 38.96 (SE +/- 0.07, N = 2)

Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128
b: 62.30 (SE +/- 2.41, N = 2); a: 64.43 (SE +/- 2.73, N = 2)

Test: r2c - Backend: Stock - Precision: double - X Y Z: 128
b: 90.99 (SE +/- 0.09, N = 2); a: 92.40 (SE +/- 0.90, N = 2)

Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256
b: 38.52 (SE +/- 0.16, N = 2); a: 38.93 (SE +/- 0.25, N = 2)

Test: r2c - Backend: Stock - Precision: double - X Y Z: 512
b: 76.60 (SE +/- 0.11, N = 2); a: 76.61 (SE +/- 0.01, N = 2)

Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512
b: 44.01 (SE +/- 0.02, N = 2); a: 43.97 (SE +/- 0.04, N = 2)

1. (CXX) g++ options: -O3

Palabos

Palabos 2.3 (Mega Site Updates Per Second, More Is Better)

Grid Size: 400
b: 285.76 (SE +/- 1.54, N = 2); a: 287.27 (SE +/- 0.49, N = 2)

Grid Size: 500
b: 300.86 (SE +/- 1.17, N = 2); a: 300.28 (SE +/- 1.63, N = 2)

1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm

Stress-NG

Stress-NG 0.15.10 (Bogo Ops/s, More Is Better)

Test: NUMA
b: 392.08 (SE +/- 0.05, N = 2); a: 390.87 (SE +/- 0.88, N = 2)

Test: Pipe
b: 36852791.12 (SE +/- 79631.10, N = 2); a: 35837711.85 (SE +/- 1105250.10, N = 2)

Test: Poll
b: 3671617.97 (SE +/- 1953.54, N = 2); a: 3669281.69 (SE +/- 2536.76, N = 2)

Test: Zlib
b: 2648.81 (SE +/- 0.65, N = 2); a: 2647.81 (SE +/- 0.06, N = 2)

Test: Futex
b: 1492979.46 (SE +/- 45385.58, N = 2); a: 1541676.36 (SE +/- 56630.43, N = 2)

Test: MEMFD
b: 549.55 (SE +/- 1.20, N = 2); a: 549.94 (SE +/- 1.31, N = 2)

Test: Mutex
b: 15192892.59 (SE +/- 2864.48, N = 2); a: 15147444.51 (SE +/- 23940.47, N = 2)

Test: Atomic
b: 132.61 (SE +/- 0.20, N = 2); a: 133.83 (SE +/- 1.05, N = 2)

Test: Crypto
b: 50243.48 (SE +/- 18.13, N = 2); a: 50240.09 (SE +/- 3.65, N = 2)

Test: Malloc
b: 99251227.28 (SE +/- 83929.32, N = 2); a: 99373474.31 (SE +/- 129754.02, N = 2)

Test: Cloning
b: 9326.09 (SE +/- 100.16, N = 2); a: 9740.57 (SE +/- 114.33, N = 2)

Test: Forking
b: 89966.29 (SE +/- 421.24, N = 2); a: 89918.21 (SE +/- 469.20, N = 2)

Test: Pthread
b: 136709.81 (SE +/- 102.07, N = 2); a: 136846.01 (SE +/- 971.78, N = 2)

Test: AVL Tree
b: 294.66 (SE +/- 0.85, N = 2); a: 294.26 (SE +/- 0.32, N = 2)

Test: IO_uring
b: 1503623.79 (SE +/- 5229.94, N = 2); a: 1529665.98 (SE +/- 22482.34, N = 2)

Test: SENDFILE
b: 598173.56 (SE +/- 243.97, N = 2); a: 582724.63 (SE +/- 6799.74, N = 2)

Test: CPU Cache
b: 1885833.11 (SE +/- 234949.06, N = 2); a: 1537111.20 (SE +/- 31294.95, N = 2)

Test: CPU Stress
b: 64118.87 (SE +/- 38.95, N = 2); a: 64111.11 (SE +/- 12.73, N = 2)

Test: Semaphores
b: 61651485.43 (SE +/- 466593.23, N = 2); a: 62126446.21 (SE +/- 2077286.42, N = 2)

Test: Matrix Math
b: 156668.43 (SE +/- 332.46, N = 2); a: 160653.44 (SE +/- 2867.57, N = 2)

Test: Vector Math
b: 151431.15 (SE +/- 5.98, N = 2); a: 151386.31 (SE +/- 47.16, N = 2)

Test: Function Call
b: 22106.49 (SE +/- 74.09, N = 2); a: 22028.03 (SE +/- 80.03, N = 2)

Test: x86_64 RdRand
b: 331423.04 (SE +/- 1.14, N = 2); a: 331416.52 (SE +/- 2.35, N = 2)

Test: Floating Point
b: 10601.10 (SE +/- 17.77, N = 2); a: 10587.48 (SE +/- 1.07, N = 2)

Test: Matrix 3D Math
b: 9605.30 (SE +/- 4.08, N = 2); a: 9599.93 (SE +/- 34.45, N = 2)

Test: Memory Copying
b: 7180.43 (SE +/- 11.04, N = 2); a: 7176.19 (SE +/- 8.71, N = 2)

Test: Vector Shuffle
b: 167202.07 (SE +/- 6.04, N = 2); a: 167204.21 (SE +/- 6.63, N = 2)

Test: Socket Activity
b: 25282.31 (SE +/- 267.39, N = 2); a: 24947.14 (SE +/- 72.57, N = 2)

Test: Wide Vector Math
b: 1750003.43 (SE +/- 4139.63, N = 2); a: 1745029.27 (SE +/- 918.08, N = 2)

Test: Context Switching
b: 2571092.69 (SE +/- 604.17, N = 2); a: 2572801.75 (SE +/- 678.57, N = 2)

Test: Fused Multiply-Add
b: 34050669.23 (SE +/- 285.63, N = 2); a: 34197705.63 (SE +/- 137631.48, N = 2)

Test: Vector Floating Point
b: 58232.70 (SE +/- 4.11, N = 2); a: 58243.38 (SE +/- 30.71, N = 2)

Test: Glibc C String Functions
b: 26125214.84 (SE +/- 69329.81, N = 2); a: 26067360.60 (SE +/- 150617.25, N = 2)

Test: Glibc Qsort Data Sorting
b: 696.92 (SE +/- 0.46, N = 2); a: 696.65 (SE +/- 0.40, N = 2)

Test: System V Message Passing
b: 5854201.78 (SE +/- 9802.94, N = 2); a: 5852281.71 (SE +/- 7174.98, N = 2)

1. (CXX) g++ options: -O2 -std=gnu99 -lc

BRL-CAD

BRL-CAD is a cross-platform, open-source solid modeling system with built-in benchmark mode. Learn more via the OpenBenchmarking.org test page.

BRL-CAD 7.36 - VGR Performance Metric (More Is Better): a = 466686 (SE +/- 3768.50, N = 2)

1. (CXX) g++ options: -std=c++14 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lregex_brl -lz_brl -lnetpbm -ldl -lm -ltk8.6

Neural Magic DeepSparse

Neural Magic DeepSparse 1.5 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): b = 34.55 (SE +/- 0.12, N = 2); a = 34.53 (SE +/- 0.06, N = 2)

Neural Magic DeepSparse 1.5 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): b = 460.76 (SE +/- 2.44, N = 2); a = 460.78 (SE +/- 0.42, N = 2)

Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): b = 1075.96 (SE +/- 1.01, N = 2); a = 1074.82 (SE +/- 0.57, N = 2)

Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): b = 14.85 (SE +/- 0.01, N = 2); a = 14.86 (SE +/- 0.01, N = 2)

Neural Magic DeepSparse 1.5 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): b = 391.91 (SE +/- 0.12, N = 2); a = 390.91 (SE +/- 1.01, N = 2)

Neural Magic DeepSparse 1.5 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): b = 40.81 (SE +/- 0.01, N = 2); a = 40.91 (SE +/- 0.11, N = 2)

Neural Magic DeepSparse 1.5 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): b = 122.04 (SE +/- 0.22, N = 2); a = 121.69 (SE +/- 0.05, N = 2)

Neural Magic DeepSparse 1.5 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): b = 131.07 (SE +/- 0.22, N = 2); a = 131.45 (SE +/- 0.05, N = 2)

Neural Magic DeepSparse 1.5 - Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): b = 480.52 (SE +/- 0.54, N = 2); a = 479.79 (SE +/- 0.12, N = 2)

Neural Magic DeepSparse 1.5 - Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): b = 33.28 (SE +/- 0.04, N = 2); a = 33.33 (SE +/- 0.01, N = 2)

Neural Magic DeepSparse 1.5 - Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): b = 3233.96 (SE +/- 3.51, N = 2); a = 3227.10 (SE +/- 8.40, N = 2)

Neural Magic DeepSparse 1.5 - Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): b = 4.9312 (SE +/- 0.0056, N = 2); a = 4.9416 (SE +/- 0.0128, N = 2)

Neural Magic DeepSparse 1.5 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): b = 208.99 (SE +/- 0.05, N = 2); a = 208.90 (SE +/- 0.10, N = 2)

Neural Magic DeepSparse 1.5 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): b = 76.47 (SE +/- 0.04, N = 2); a = 76.56 (SE +/- 0.03, N = 2)

Neural Magic DeepSparse 1.5 - Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): b = 37.33 (SE +/- 0.38, N = 2); a = 35.15 (SE +/- 0.01, N = 2)

Neural Magic DeepSparse 1.5 - Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): b = 428.67 (SE +/- 4.41, N = 2); a = 453.48 (SE +/- 0.26, N = 2)

Neural Magic DeepSparse 1.5 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): b = 479.22 (SE +/- 0.02, N = 2); a = 478.91 (SE +/- 0.05, N = 2)

Neural Magic DeepSparse 1.5 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): b = 33.37 (SE +/- 0.00, N = 2); a = 33.39 (SE +/- 0.00, N = 2)

Neural Magic DeepSparse 1.5 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): b = 211.23 (SE +/- 0.12, N = 2); a = 208.85 (SE +/- 0.34, N = 2)

Neural Magic DeepSparse 1.5 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): b = 75.72 (SE +/- 0.04, N = 2); a = 76.58 (SE +/- 0.13, N = 2)

Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): b = 299.93 (SE +/- 0.51, N = 2); a = 295.83 (SE +/- 0.05, N = 2)

Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): b = 53.33 (SE +/- 0.09, N = 2); a = 54.07 (SE +/- 0.01, N = 2)

Neural Magic DeepSparse 1.5 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): b = 46.55 (SE +/- 0.20, N = 2); a = 46.33 (SE +/- 0.02, N = 2)

Neural Magic DeepSparse 1.5 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): b = 343.52 (SE +/- 1.63, N = 2); a = 345.15 (SE +/- 0.15, N = 2)

Neural Magic DeepSparse 1.5 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): b = 505.13 (SE +/- 0.12, N = 2); a = 504.61 (SE +/- 0.18, N = 2)

Neural Magic DeepSparse 1.5 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): b = 31.65 (SE +/- 0.01, N = 2); a = 31.68 (SE +/- 0.01, N = 2)

Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): b = 143.44 (SE +/- 0.80, N = 2); a = 137.38 (SE +/- 4.10, N = 2)

Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): b = 111.50 (SE +/- 0.59, N = 2); a = 116.38 (SE +/- 3.45, N = 2)

Neural Magic DeepSparse 1.5 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): b = 34.54 (SE +/- 0.03, N = 2); a = 33.94 (SE +/- 0.07, N = 2)

Neural Magic DeepSparse 1.5 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): b = 460.67 (SE +/- 0.20, N = 2); a = 468.80 (SE +/- 1.46, N = 2)
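The DeepSparse results come in pairs: a throughput figure (items/sec) and a latency figure (ms/batch) for the same model. A rough way to relate the two is Little's law, which says the average number of requests in flight equals throughput multiplied by latency. The sketch below applies it to the first-listed value of a few charts above. It assumes a batch size of 1 (the result file does not state the batch size), and `implied_concurrency` is a hypothetical helper name, not part of DeepSparse.

```python
def implied_concurrency(items_per_sec, ms_per_batch, batch_size=1):
    """Little's law: requests in flight = throughput * latency.

    batch_size=1 is an assumption; the result file does not state it.
    """
    batches_per_sec = items_per_sec / batch_size
    return batches_per_sec * (ms_per_batch / 1000.0)


# (items/sec, ms/batch) pairs: first-listed value from each chart pair.
pairs = {
    "oBERT base uncased on IMDB": (34.55, 460.76),
    "BERT base uncased SST2, Sparse INT8": (1075.96, 14.85),
    "ResNet-50, Baseline": (480.52, 33.28),
    "ResNet-50, Sparse INT8": (3233.96, 4.9312),
}

for model, (throughput, latency_ms) in pairs.items():
    print(f"{model}: ~{implied_concurrency(throughput, latency_ms):.1f} in flight")
```

Under that assumption every pair lands close to 16 requests in flight, which would be consistent with the asynchronous multi-stream scenario keeping a fixed number of streams busy. That is an inference from the numbers, not a documented setting.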

High Performance Conjugate Gradient

HPCG is the High Performance Conjugate Gradient benchmark, a newer scientific benchmark from Sandia National Labs aimed at supercomputer testing with modern real-world workloads, in contrast to HPCC. Learn more via the OpenBenchmarking.org test page.

High Performance Conjugate Gradient 3.1 - X Y Z: 104 104 104 - RT: 60 (GFLOP/s, More Is Better): b = 27.84 (SE +/- 0.01, N = 2); a = 27.78 (SE +/- 0.03, N = 2)

High Performance Conjugate Gradient 3.1 - X Y Z: 144 144 144 - RT: 60 (GFLOP/s, More Is Better): b = 27.39 (SE +/- 0.06, N = 2); a = 27.42 (SE +/- 0.01, N = 2)

High Performance Conjugate Gradient 3.1 - X Y Z: 160 160 160 - RT: 60 (GFLOP/s, More Is Better): b = 27.40 (SE +/- 0.07, N = 2); a = 27.51 (SE +/- 0.03, N = 2)

1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi
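For readers unfamiliar with the method the benchmark is named after, here is a minimal, unpreconditioned conjugate gradient solver in plain Python. It is only an illustration of the iteration itself: the 1D Laplacian test matrix, the function names, and the tolerance are assumptions made for this sketch, and the real HPCG code is a far more elaborate parallel implementation.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def laplacian_1d(v):
    """Apply the tridiagonal (-1, 2, -1) matrix, a small SPD test operator."""
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append(2.0 * v[i] - left - right)
    return out


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A, given x -> A x."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x, with x = 0 initially
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        ap = matvec(p)
        alpha = rs_old / dot(p, ap)          # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        # New direction: residual plus a multiple of the old direction.
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


b = [1.0] * 8
x = conjugate_gradient(laplacian_1d, b)
residual = [bi - axi for bi, axi in zip(b, laplacian_1d(x))]
print("max residual:", max(abs(ri) for ri in residual))
```

Each iteration is dominated by one matrix-vector product and a few dot products, so the method tends to stress memory bandwidth more than raw floating-point throughput.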

OpenFOAM

OpenFOAM is the leading free, open-source software for computational fluid dynamics (CFD). This test profile currently uses the drivaerFastback test case for analyzing automotive aerodynamics or alternatively the older motorBike input. Learn more via the OpenBenchmarking.org test page.

OpenFOAM 10 - Input: drivaerFastback, Small Mesh Size - Mesh Time (Seconds, Fewer Is Better): b = 27.95 (SE +/- 0.05, N = 2); a = 27.97 (SE +/- 0.02, N = 2)

OpenFOAM 10 - Input: drivaerFastback, Small Mesh Size - Execution Time (Seconds, Fewer Is Better): b = 67.56 (SE +/- 0.11, N = 2); a = 67.71 (SE +/- 0.09, N = 2)

OpenFOAM 10 - Input: drivaerFastback, Medium Mesh Size - Mesh Time (Seconds, Fewer Is Better): b = 144.94 (SE +/- 0.08, N = 2); a = 144.70 (SE +/- 0.01, N = 2)

OpenFOAM 10 - Input: drivaerFastback, Medium Mesh Size - Execution Time (Seconds, Fewer Is Better): b = 615.46 (SE +/- 0.03, N = 2); a = 615.99 (SE +/- 0.42, N = 2)

1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

Timed GDB GNU Debugger Compilation

This test times how long it takes to build the GNU Debugger (GDB) in a default configuration. Learn more via the OpenBenchmarking.org test page.

Timed GDB GNU Debugger Compilation 10.2 - Time To Compile (Seconds, Fewer Is Better): b = 42.01 (SE +/- 0.12, N = 2); a = 41.91 (SE +/- 0.06, N = 2)

Timed LLVM Compilation

This test times how long it takes to compile/build the LLVM compiler stack. Learn more via the OpenBenchmarking.org test page.

Timed LLVM Compilation 16.0 - Build System: Ninja (Seconds, Fewer Is Better): b = 262.88 (SE +/- 0.15, N = 2); a = 263.15 (SE +/- 0.15, N = 2)

Timed LLVM Compilation 16.0 - Build System: Unix Makefiles (Seconds, Fewer Is Better): b = 319.85 (SE +/- 5.88, N = 2); a = 323.86 (SE +/- 5.08, N = 2)

Timed PHP Compilation

This test times how long it takes to build PHP. Learn more via the OpenBenchmarking.org test page.

Timed PHP Compilation 8.1.9 - Time To Compile (Seconds, Fewer Is Better): b = 42.38 (SE +/- 0.48, N = 2); a = 42.35 (SE +/- 0.34, N = 2)

Timed Linux Kernel Compilation

This test times how long it takes to build the Linux kernel in a default configuration (defconfig) for the architecture being tested or alternatively an allmodconfig for building all possible kernel modules for the build. Learn more via the OpenBenchmarking.org test page.

Timed Linux Kernel Compilation 6.1 - Build: defconfig (Seconds, Fewer Is Better): b = 40.45 (SE +/- 0.69, N = 2); a = 40.44 (SE +/- 0.72, N = 2)

Timed Linux Kernel Compilation 6.1 - Build: allmodconfig (Seconds, Fewer Is Better): b = 445.38 (SE +/- 1.13, N = 2); a = 445.39 (SE +/- 1.46, N = 2)

Blender

Blender 3.6 - Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better): b = 47.22 (SE +/- 0.08, N = 2); a = 47.15 (SE +/- 0.02, N = 2)

Blender 3.6 - Blend File: Classroom - Compute: CPU-Only (Seconds, Fewer Is Better): b = 127.76 (SE +/- 0.13, N = 2); a = 127.78 (SE +/- 0.05, N = 2)

Blender 3.6 - Blend File: Fishy Cat - Compute: CPU-Only (Seconds, Fewer Is Better): b = 64.01 (SE +/- 0.20, N = 2); a = 64.07 (SE +/- 0.08, N = 2)

Blender 3.6 - Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better): b = 493.61 (SE +/- 0.42, N = 2); a = 493.45 (SE +/- 0.22, N = 2)

Blender 3.6 - Blend File: Pabellon Barcelona - Compute: CPU-Only (Seconds, Fewer Is Better): a = 159.94 (SE +/- 0.04, N = 2)

VVenC

VVenC is the Fraunhofer Versatile Video Encoder as a fast/efficient H.266/VVC encoder. The vvenc encoder makes use of SIMD Everywhere (SIMDe). The vvenc software is published under the Clear BSD License. Learn more via the OpenBenchmarking.org test page.

VVenC 1.9 - Video Input: Bosphorus 4K - Video Preset: Fast (Frames Per Second, More Is Better): b = 5.917 (SE +/- 0.015, N = 2); a = 5.842 (SE +/- 0.074, N = 2)

VVenC 1.9 - Video Input: Bosphorus 4K - Video Preset: Faster (Frames Per Second, More Is Better): b = 10.99 (SE +/- 0.03, N = 2); a = 11.02 (SE +/- 0.00, N = 2)

VVenC 1.9 - Video Input: Bosphorus 1080p - Video Preset: Fast (Frames Per Second, More Is Better): b = 16.25 (SE +/- 0.02, N = 2); a = 16.10 (SE +/- 0.17, N = 2)

VVenC 1.9 - Video Input: Bosphorus 1080p - Video Preset: Faster (Frames Per Second, More Is Better): b = 30.93 (SE +/- 0.04, N = 2); a = 30.95 (SE +/- 0.06, N = 2)

1. (CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto

Liquid-DSP

LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.

Liquid-DSP 1.6 - Threads: 16 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): b = 558655000 (SE +/- 605000.00, N = 2); a = 557945000 (SE +/- 2065000.00, N = 2)

Liquid-DSP 1.6 - Threads: 16 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): b = 862195000 (SE +/- 695000.00, N = 2); a = 848435000 (SE +/- 14365000.00, N = 2)

Liquid-DSP 1.6 - Threads: 32 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): b = 847675000 (SE +/- 85000.00, N = 2); a = 847085000 (SE +/- 25000.00, N = 2)

Liquid-DSP 1.6 - Threads: 32 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): b = 1323900000 (SE +/- 4400000.00, N = 2); a = 1328100000 (SE +/- 300000.00, N = 2)

Liquid-DSP 1.6 - Threads: 64 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): b = 1576850000 (SE +/- 450000.00, N = 2); a = 1577300000 (SE +/- 300000.00, N = 2)

Liquid-DSP 1.6 - Threads: 64 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): b = 1733700000 (SE +/- 900000.00, N = 2); a = 1728850000 (SE +/- 550000.00, N = 2)

Liquid-DSP 1.6 - Threads: 16 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): b = 248820000 (SE +/- 3170000.00, N = 2); a = 243940000 (SE +/- 1950000.00, N = 2)

Liquid-DSP 1.6 - Threads: 32 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): b = 378650000 (SE +/- 4920000.00, N = 2); a = 383555000 (SE +/- 1955000.00, N = 2)

Liquid-DSP 1.6 - Threads: 64 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): b = 513040000 (SE +/- 800000.00, N = 2); a = 513135000 (SE +/- 385000.00, N = 2)

1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

srsRAN Project

srsRAN Project is a complete ORAN-native 5G RAN solution created by Software Radio Systems (SRS). The srsRAN Project radio suite was formerly known as srsLTE and can be used for building your own software-defined radio (SDR) 4G/5G mobile network. Learn more via the OpenBenchmarking.org test page.

srsRAN Project 23.5 - Test: Downlink Processor Benchmark (Mbps, More Is Better): b = 710.9 (SE +/- 1.60, N = 2); a = 705.8 (SE +/- 5.15, N = 2)

srsRAN Project 23.5 - Test: PUSCH Processor Benchmark, Throughput Total (Mbps, More Is Better): b = 5543.7 (SE +/- 95.40, N = 2); a = 5372.9 (SE +/- 143.30, N = 2)

srsRAN Project 23.5 - Test: PUSCH Processor Benchmark, Throughput Thread (Mbps, More Is Better): b = 236.3 (SE +/- 0.10, N = 2); a = 240.4 (SE +/- 3.55, N = 2)

1. (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest

Apache IoTDB

Apache IoTDB 1.1.2 - Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 200 (point/sec, More Is Better): b = 697217.55; a = 710382.44

Apache IoTDB 1.1.2 - Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 200 (Average Latency, More Is Better): b = 14.98 (MAX: 612.21); a = 14.58 (MAX: 679.89)

Apache IoTDB 1.1.2 - Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 500 (point/sec, More Is Better): b = 1185338.02; a = 1191500.88

Apache IoTDB 1.1.2 - Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 500 (Average Latency, More Is Better): b = 28.45 (MAX: 664.29); a = 28.27 (MAX: 671.77)

Apache IoTDB 1.1.2 - Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 200 (point/sec, More Is Better): b = 1042859.03; a = 1045806.81

Apache IoTDB 1.1.2 - Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 200 (Average Latency, More Is Better): b = 12.18 (MAX: 586.62); a = 11.86 (MAX: 573.1)

Apache IoTDB 1.1.2 - Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 500 (point/sec, More Is Better): b = 1469808.89; a = 1505080.34

Apache IoTDB 1.1.2 - Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 500 (Average Latency, More Is Better): b = 26.64 (MAX: 636.93); a = 26.29 (MAX: 620.79)

Apache IoTDB 1.1.2 - Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 200 (point/sec, More Is Better): b = 1521587.40; a = 1576432.25

Apache IoTDB 1.1.2 - Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 200 (Average Latency, More Is Better): b = 9.87 (MAX: 820.85); a = 9.49 (MAX: 845.95)

Apache IoTDB 1.1.2 - Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 500 (point/sec, More Is Better): b = 2009050.46; a = 1916642.90

Apache IoTDB 1.1.2 - Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 500 (Average Latency, More Is Better): b = 21.63 (MAX: 867.44); a = 22.97 (MAX: 864.74)

Apache IoTDB 1.1.2 - Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 200 (point/sec, More Is Better): b = 34191814.86; a = 43074031.84

Apache IoTDB 1.1.2 - Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 200 (Average Latency, More Is Better): b = 43.86 (MAX: 2550.76); a = 31.83 (MAX: 790.74)

Apache IoTDB 1.1.2 - Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 500 (point/sec, More Is Better): b = 56018457.87; a = 59041436.64

Apache IoTDB 1.1.2 - Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 500 (Average Latency, More Is Better): b = 73.56 (MAX: 1309.93); a = 69.08 (MAX: 1049.85)

Apache IoTDB 1.1.2 - Device Count: 200 - Batch Size Per Write: 100 - Sensor Count: 200 (point/sec, More Is Better): b = 51199962.11; a = 54224351.10

Apache IoTDB 1.1.2 - Device Count: 200 - Batch Size Per Write: 100 - Sensor Count: 200 (Average Latency, More Is Better): b = 31.63 (MAX: 718.08); a = 29.54 (MAX: 746.57)

Apache IoTDB 1.1.2 - Device Count: 200 - Batch Size Per Write: 100 - Sensor Count: 500 (point/sec, More Is Better): b = 46726912.46; a = 45677447.24

Apache IoTDB 1.1.2 - Device Count: 200 - Batch Size Per Write: 100 - Sensor Count: 500 (Average Latency, More Is Better): b = 98.87 (MAX: 3564.64); a = 101.25 (MAX: 3631.89)

Apache IoTDB 1.1.2 - Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 200 (point/sec, More Is Better): b = 56137174.70; a = 56894390.61

Apache IoTDB 1.1.2 - Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 200 (Average Latency, More Is Better): b = 31.69 (MAX: 1610.79); a = 31.58 (MAX: 1920.32)

Apache IoTDB 1.1.2 - Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 500 (point/sec, More Is Better): b = 65935725.67; a = 67607191.64

Apache IoTDB 1.1.2 - Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 500 (Average Latency, More Is Better): b = 68.01 (MAX: 1606.75); a = 68.34 (MAX: 2006.68)
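Unlike most results on this page, the Apache IoTDB charts report no standard error, so a plain percent change between the two listed values is about as far as a comparison can go. The sketch below does that for two of the point/sec results above; `percent_change` is a hypothetical helper, and the values are taken in the order they are listed in each chart.

```python
def percent_change(first, second):
    """Relative change from the first-listed value to the second, in percent."""
    return (second - first) / first * 100.0


# point/sec values in listed order, keyed by
# "Device Count - Batch Size Per Write - Sensor Count".
results = {
    "100 - 100 - 200": (34191814.86, 43074031.84),
    "500 - 1 - 500": (2009050.46, 1916642.90),
}

for config, (first, second) in results.items():
    print(f"{config}: {percent_change(first, second):+.2f}%")
```

Without per-run variance there is no way to tell from these numbers alone whether a swing of this size reflects a real difference or ordinary run-to-run noise.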

Redis 7.0.12 + memtier_benchmark

Memtier_benchmark is a NoSQL Redis/Memcache traffic generation plus benchmarking tool developed by Redis Labs. Learn more via the OpenBenchmarking.org test page.

Redis 7.0.12 + memtier_benchmark 2.0 - Protocol: Redis - Clients: 50 - Set To Get Ratio: 1:5 (Ops/sec, More Is Better): b = 2217192.12 (SE +/- 39004.04, N = 2); a = 2211638.65 (SE +/- 31848.80, N = 2)

Redis 7.0.12 + memtier_benchmark 2.0 - Protocol: Redis - Clients: 100 - Set To Get Ratio: 1:5 (Ops/sec, More Is Better): b = 2227152.02 (SE +/- 3990.38, N = 2); a = 2285996.17 (SE +/- 6000.63, N = 2)

Redis 7.0.12 + memtier_benchmark 2.0 - Protocol: Redis - Clients: 50 - Set To Get Ratio: 1:10 (Ops/sec, More Is Better): b = 2293467.62 (SE +/- 4548.93, N = 2); a = 2316281.26 (SE +/- 13610.76, N = 2)

1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre

Protocol: Redis - Clients: 500 - Set To Get Ratio: 1:5

a: The test run did not produce a result.

b: The test run did not produce a result.

Redis 7.0.12 + memtier_benchmark 2.0 - Protocol: Redis - Clients: 100 - Set To Get Ratio: 1:10 (Ops/sec, More Is Better): b = 2304730.19 (SE +/- 12975.09, N = 2); a = 2447092.01 (SE +/- 114392.77, N = 2)

1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre

Protocol: Redis - Clients: 500 - Set To Get Ratio: 1:10

a: The test run did not produce a result.

b: The test run did not produce a result.

Apache Cassandra

This is a benchmark of the Apache Cassandra NoSQL database management system making use of cassandra-stress. Learn more via the OpenBenchmarking.org test page.

Apache Cassandra 4.1.3 - Test: Writes (Op/s, More Is Better): a = 155626 (SE +/- 803.50, N = 2)

164 Results Shown

HeFFTe - Highly Efficient FFT for Exascale:
  c2c - Stock - double - 128
  c2c - Stock - double - 512
  r2c - FFTW - double - 512
  r2c - Stock - float - 256
  c2c - Stock - float - 256
  r2c - FFTW - double - 128
Laghos
libxsmm:
  32
  64
  256
  128
Stress-NG:
  Hash
  MMAP
HeFFTe - Highly Efficient FFT for Exascale
Laghos
Palabos
HeFFTe - Highly Efficient FFT for Exascale:
  c2c - FFTW - float - 128
  c2c - Stock - float - 128
  c2c - FFTW - float - 256
  c2c - Stock - float - 512
  c2c - FFTW - float - 512
  r2c - FFTW - double - 256
  r2c - FFTW - float - 128
  r2c - Stock - float - 128
  r2c - FFTW - float - 256
  r2c - Stock - float - 512
  r2c - FFTW - float - 512
  c2c - Stock - double - 256
  c2c - FFTW - double - 128
  r2c - Stock - double - 128
  c2c - FFTW - double - 256
  r2c - Stock - double - 512
  c2c - FFTW - double - 512
Palabos:
  400
  500
Stress-NG:
  NUMA
  Pipe
  Poll
  Zlib
  Futex
  MEMFD
  Mutex
  Atomic
  Crypto
  Malloc
  Cloning
  Forking
  Pthread
  AVL Tree
  IO_uring
  SENDFILE
  CPU Cache
  CPU Stress
  Semaphores
  Matrix Math
  Vector Math
  Function Call
  x86_64 RdRand
  Floating Point
  Matrix 3D Math
  Memory Copying
  Vector Shuffle
  Socket Activity
  Wide Vector Math
  Context Switching
  Fused Multiply-Add
  Vector Floating Point
  Glibc C String Functions
  Glibc Qsort Data Sorting
  System V Message Passing
BRL-CAD
Neural Magic DeepSparse:
  NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  ResNet-50, Baseline - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  ResNet-50, Sparse INT8 - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  BERT-Large, NLP Question Answering - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream:
    items/sec
    ms/batch
High Performance Conjugate Gradient:
  104 104 104 - 60
  144 144 144 - 60
  160 160 160 - 60
OpenFOAM:
  drivaerFastback, Small Mesh Size - Mesh Time
  drivaerFastback, Small Mesh Size - Execution Time
  drivaerFastback, Medium Mesh Size - Mesh Time
  drivaerFastback, Medium Mesh Size - Execution Time
Timed GDB GNU Debugger Compilation
Timed LLVM Compilation:
  Ninja
  Unix Makefiles
Timed PHP Compilation
Timed Linux Kernel Compilation:
  defconfig
  allmodconfig
Blender:
  BMW27 - CPU-Only
  Classroom - CPU-Only
  Fishy Cat - CPU-Only
  Barbershop - CPU-Only
  Pabellon Barcelona - CPU-Only
VVenC:
  Bosphorus 4K - Fast
  Bosphorus 4K - Faster
  Bosphorus 1080p - Fast
  Bosphorus 1080p - Faster
Liquid-DSP:
  16 - 256 - 32
  16 - 256 - 57
  32 - 256 - 32
  32 - 256 - 57
  64 - 256 - 32
  64 - 256 - 57
  16 - 256 - 512
  32 - 256 - 512
  64 - 256 - 512
srsRAN Project:
  Downlink Processor Benchmark
  PUSCH Processor Benchmark, Throughput Total
  PUSCH Processor Benchmark, Throughput Thread
Apache IoTDB:
  100 - 1 - 200:
    point/sec
    Average Latency
  100 - 1 - 500:
    point/sec
    Average Latency
  200 - 1 - 200:
    point/sec
    Average Latency
  200 - 1 - 500:
    point/sec
    Average Latency
  500 - 1 - 200:
    point/sec
    Average Latency
  500 - 1 - 500:
    point/sec
    Average Latency
  100 - 100 - 200:
    point/sec
    Average Latency
  100 - 100 - 500:
    point/sec
    Average Latency
  200 - 100 - 200:
    point/sec
    Average Latency
  200 - 100 - 500:
    point/sec
    Average Latency
  500 - 100 - 200:
    point/sec
    Average Latency
  500 - 100 - 500:
    point/sec
    Average Latency
Redis 7.0.12 + memtier_benchmark:
  Redis - 50 - 1:5
  Redis - 100 - 1:5
  Redis - 50 - 1:10
  Redis - 100 - 1:10
Apache Cassandra