Gigabyte G242-P36 Ampere Altra Max Server

Benchmarks by Michael Larabel for a future article.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2401176-NE-GIGABYTEG67
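
For scripted comparisons, the command above can be wrapped in a small launcher. This is a minimal sketch, assuming the phoronix-test-suite executable is on PATH. The helper names build_command and run_comparison are hypothetical; only the benchmark subcommand and the result ID come from this page.

```python
import shutil
import subprocess

RESULT_ID = "2401176-NE-GIGABYTEG67"  # result file ID quoted on this page


def build_command(result_id: str = RESULT_ID) -> list[str]:
    """Return the argv list for benchmarking against a result file (hypothetical helper)."""
    return ["phoronix-test-suite", "benchmark", result_id]


def run_comparison(result_id: str = RESULT_ID) -> int:
    """Run the comparison if the tool is installed.

    Returns the tool's exit code, or -1 when the executable is not on PATH.
    """
    if shutil.which("phoronix-test-suite") is None:
        print("phoronix-test-suite not found on PATH")
        return -1
    return subprocess.run(build_command(result_id), check=False).returncode
```

Calling run_comparison() starts the full test selection, which took between roughly 2.5 and 19 hours per run on the system documented here, so it is left for the reader to invoke explicitly.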
Run Management

Result Identifier    Date          Test Run Duration
G242-P36             January 16    19 Hours, 11 Minutes
gig                  January 17    2 Hours, 33 Minutes
dd                   January 17    2 Hours, 24 Minutes
Average                            8 Hours, 3 Minutes



Processor: ARMv8 Neoverse-N1 @ 3.00GHz (128 Cores)
Motherboard: GIGABYTE G242-P36-00 MP32-AR2-00 v01000100 (F31k SCP
Chipset: Ampere Computing LLC Altra PCI Root Complex A
Memory: 16 x 32 GB DDR4-3200MT/s Samsung M393A4K40DB3-CWE
Disk: 800GB Micron_7450_MTFDKBA800TFS
Graphics: ASPEED
Monitor: VGA HDMI
Network: 2 x Intel I350
OS: Ubuntu 23.10
Kernel: 6.5.0-13-generic (aarch64)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 1920x1080

System Logs

- Transparent Huge Pages: madvise
- --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
- Scaling Governor: cppc_cpufreq performance (Boost: Disabled)
- Python 3.11.6
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected

Result Overview (chart of relative performance for G242-P36, gig and dd; axis from 100% to 121%). Tests shown: Stockfish, Llama.cpp, LeelaChessZero, Quicksilver, RocksDB, Timed Linux Kernel Compilation, Stress-NG, Timed LLVM Compilation, Speedb, Neural Magic DeepSparse, 7-Zip Compression, OpenSSL, CacheBench.

Gigabyte G242-P36 Ampere Altra Max Serverpytorch: CPU - 1 - Efficientnet_v2_lpytorch: CPU - 16 - ResNet-152pytorch: CPU - 16 - ResNet-50pytorch: CPU - 1 - ResNet-152pytorch: CPU - 1 - ResNet-50xmrig: Wownero - 1Mspeedb: Seq Fillxmrig: Monero - 1Mdeepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Streamdeepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Streambuild-llvm: Unix Makefileslczero: BLASlczero: Eigendeepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Streamdeepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Streamquicksilver: CTS2build-linux-kernel: allmodconfigstress-ng: Atomicbuild-llvm: Ninjallama-cpp: llama-2-70b-chat.Q5_0.ggufstockfish: Total Timeopenssl: ChaCha20-Poly1305openssl: ChaCha20openssl: AES-256-GCMspeedb: Read While Writingrocksdb: Rand Readquicksilver: CORAL2 P2openssl: AES-128-GCMopenssl: SHA256openssl: SHA512speedb: Rand Readrocksdb: Read While Writingllama-cpp: llama-2-13b.Q4_0.ggufcachebench: Readcachebench: Read / Modify / Writecachebench: Writerocksdb: Read Rand Write Randstress-ng: Futexstress-ng: Context Switchingdeepsparse: BERT-Large, NLP Question Answering - Asynchronous Multi-Streamdeepsparse: BERT-Large, NLP Question Answering - Asynchronous Multi-Streamamg: build-linux-kernel: defconfigdeepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Streamdeepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Streamgromacs: MPI CPU - water_GMX50_barestress-ng: MEMFDspeedb: Rand Fillspeedb: Rand Fill Syncspeedb: Update Randspeedb: Read Rand Write Randrocksdb: Update Randopenssl: RSA4096openssl: RSA4096deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Streamdeepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Streamdeepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Streamdeepsparse: NLP Document Classification, 
oBERT base uncased on IMDB - Asynchronous Multi-Streamdeepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Streamdeepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Streamdeepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Streamdeepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Streamquicksilver: CORAL2 P1deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Streamdeepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Streamdeepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Streamdeepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Streamdeepsparse: ResNet-50, Baseline - Asynchronous Multi-Streamdeepsparse: ResNet-50, Baseline - Asynchronous Multi-Streamdeepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Streamdeepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Streamcompress-7zip: Decompression Ratingcompress-7zip: Compression Ratingstress-ng: IO_uringstress-ng: MMAPstress-ng: Cloningstress-ng: Mallocstress-ng: CPU Cachestress-ng: Pthreadstress-ng: Zlibstress-ng: Vector Shufflestress-ng: Vector Mathstress-ng: Wide Vector Mathstress-ng: Matrix Mathstress-ng: Function Callstress-ng: Matrix 3D Mathstress-ng: CPU Stressstress-ng: AVL Treestress-ng: Cryptostress-ng: Fused Multiply-Addstress-ng: Hashstress-ng: SENDFILEstress-ng: AVX-512 VNNIstress-ng: Glibc Qsort Data Sortingstress-ng: Vector Floating Pointstress-ng: Floating Pointstress-ng: Pollstress-ng: Glibc C String Functionsstress-ng: System V Message Passingstress-ng: Forkingstress-ng: Memory Copyingstress-ng: Semaphoresstress-ng: Mutexstress-ng: Mixed Schedulerstress-ng: NUMAstress-ng: Pipestress-ng: Socket Activityllama-cpp: llama-2-7b.Q4_0.ggufminife: Smallmt-dgemm: Sustained Floating-Point 
RateG242-P36gigdd0.300.671.830.681.911935.22950794201.723.50042677.0708411.52162481358.177345.678816203333308.2977.29266.3333.07188653177112213448840161732226070306487842680129050354340523552554333338268820730010132296175334478769590409571625855884513.9011438.27651645034.97615638239.9707303320337343012.7520365273.281320.135447.0250105706433378.703146.7462430.13754.588574.852849872073762722752419683431406517886.06342.855.57031137.7811830.576033.74731834.579933.6229310.8403202.227925273333314.1452200.0280185.3571339.9765132.3047476.3781132.1032477.8141537647333316604943.761088.777795.96164364343.39879814.35113551.875987.8886218.95398869.872346519.63681885.3072283.185099.8133761.08299.50252315.26151220570.5115671801.481624492.924690386.642020.18102535.3522213.547330369.9662783286.4821143237.7252250.5327153.74167637763.5937172432.6636794.331419.0630330081.1828009.0721.5823996.017.78498329005924.01252624.7719408.27159471334.543346.727316460000309.4775.64267.863.13177653916112250396400161791663040306544534870132553414505009122552000038285632826010003959375034453399030418448304851606014.0211438.66616145027.47270138251.5919243449038323012.9619654874.851326.958146.5531106013600080.078149.4774421.3454.688576.532782642044102649982518519427908518115.96345.655.42331141.4511830.717433.8691832.115433.5823310.6422202.633225810000315.8962198.9064185.8675339.5239132.3228477.0964133.4484472.0699541204333057612149.931104.197312.78164067515.18882510.28112993.155993.7486375.79398993.462355564.94682490.7572298.235082.6533765.26299.1251986.12151387869.7615654462.921624969.464691697.852022.01102553.1122219.87392099.8262867317.1621054213.7950130.9727162.14167850957.6837215286.0436309.291416.0329805509.1227959.8521.924150.718.2727528576623.42092684.8341407.1960481336.392446.499816430000310.1376.8264.7443.14226859548137855304042918132446000038279302868010132123745034448701700420437471863656314.1111438.86384745041.15485338252.628443537322318037.9320708288.981327.996246.419480.243145.2837433.859
3569.362853162078912647482473336443804518085.76345.355.72221135.43651843.239633.15921850.226433.2422311.5325201.755425510000316.9118198.3147183.6437343.7639132.2627477.6899130.6575483.308541552331579583751.831092.256918.49164592319.96882225.34113379.285985.6986257.77399042.092354926.97682554.3372290.815089.1933559.87299.99251996.36151037296.4615654282.581624702.094692452.82020.3102604.7422220.77395099.6462845443.5321119614.3150686.5827159.07166379337.6737267646.9136361.291426.4530776841.7327536.7926.64OpenBenchmarking.org

PyTorch

This is a benchmark of PyTorch making use of pytorch-benchmark [https://github.com/LukasHedegaard/pytorch-benchmark]. This test profile currently targets CPU-based testing. Learn more via the OpenBenchmarking.org test page.

PyTorch 2.1 - Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l
batches/sec, More Is Better
G242-P36: 0.30 (MIN: 0.27 / MAX: 0.4)
SE +/- 0.00, N = 3

PyTorch 2.1 - Device: CPU - Batch Size: 16 - Model: ResNet-152
batches/sec, More Is Better
G242-P36: 0.67 (MIN: 0.65 / MAX: 0.7)
SE +/- 0.00, N = 2

PyTorch 2.1 - Device: CPU - Batch Size: 16 - Model: ResNet-50
batches/sec, More Is Better
G242-P36: 1.83 (MIN: 1.7 / MAX: 2.02)
SE +/- 0.02, N = 5

PyTorch 2.1 - Device: CPU - Batch Size: 1 - Model: ResNet-152
batches/sec, More Is Better
G242-P36: 0.68 (MIN: 0.65 / MAX: 0.7)
SE +/- 0.00, N = 3

PyTorch 2.1 - Device: CPU - Batch Size: 1 - Model: ResNet-50
batches/sec, More Is Better
G242-P36: 1.91 (MIN: 1.8 / MAX: 2.09)
SE +/- 0.00, N = 3
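
Each result on this page carries an "SE +/-" figure and a run count N. As a rough illustration of how such a figure can be derived, the sketch below computes the standard error of the mean from per-run values. It assumes the sample standard deviation divided by the square root of N; whether the Phoronix Test Suite uses exactly this estimator is an assumption. The run values are hypothetical and not taken from the result file.

```python
import math
import statistics


def standard_error(samples: list[float]) -> float:
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("need at least two runs to estimate a standard error")
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical per-run values in batches/sec (illustration only).
runs = [1.80, 1.83, 1.86]
print(
    f"mean = {statistics.mean(runs):.2f}, "
    f"SE +/- {standard_error(runs):.2f}, N = {len(runs)}"
)
```

A small SE relative to the mean suggests the runs agreed closely; results with a large SE and a high N (the suite adds runs when variance is high) are the ones to treat with more caution.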

Xmrig

Xmrig is an open-source cross-platform CPU/GPU miner for RandomX, KawPow, CryptoNight and AstroBWT. This test profile is set up to measure the Xmrig CPU mining performance. Learn more via the OpenBenchmarking.org test page.

Xmrig 6.21 - Variant: Wownero - Hash Count: 1M
H/s, More Is Better
G242-P36: 1935.2
SE +/- 2.92, N = 3
1. (CXX) g++ options: -fexceptions -fno-rtti -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

Speedb

Speedb is a next-generation key value storage engine that is RocksDB compatible and aiming for stability, efficiency, and performance. Learn more via the OpenBenchmarking.org test page.

Speedb 2.7 - Test: Sequential Fill
Op/s, More Is Better
G242-P36: 295079
gig: 290059
dd: 285766
SE +/- 3101.60, N = 5
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Xmrig

Xmrig is an open-source cross-platform CPU/GPU miner for RandomX, KawPow, CryptoNight and AstroBWT. This test profile is set up to measure the Xmrig CPU mining performance. Learn more via the OpenBenchmarking.org test page.

Xmrig 6.21 - Variant: Monero - Hash Count: 1M
H/s, More Is Better
G242-P36: 4201.7
SE +/- 17.55, N = 3
1. (CXX) g++ options: -fexceptions -fno-rtti -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

Neural Magic DeepSparse

This is a benchmark of Neural Magic's DeepSparse using its built-in deepsparse.benchmark utility and various models from their SparseZoo (https://sparsezoo.neuralmagic.com/). Learn more via the OpenBenchmarking.org test page.

Neural Magic DeepSparse 1.6 - Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better
dd: 23.42
G242-P36: 23.50
gig: 24.01
SE +/- 0.21, N = 3

Neural Magic DeepSparse 1.6 - Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
dd: 2684.83
G242-P36: 2677.07
gig: 2624.77
SE +/- 25.24, N = 3

Timed LLVM Compilation

This test times how long it takes to compile/build the LLVM compiler stack. Learn more via the OpenBenchmarking.org test page.

Timed LLVM Compilation 16.0 - Build System: Unix Makefiles
Seconds, Fewer Is Better
dd: 407.19
gig: 408.27
G242-P36: 411.52
SE +/- 1.15, N = 3

LeelaChessZero

LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.

LeelaChessZero 0.30 - Backend: BLAS
Nodes Per Second, More Is Better
G242-P36: 62
dd: 60
gig: 59
SE +/- 0.58, N = 3
1. (CXX) g++ options: -flto -pthread

LeelaChessZero 0.30 - Backend: Eigen
Nodes Per Second, More Is Better
dd: 48
G242-P36: 48
gig: 47
SE +/- 0.33, N = 3
1. (CXX) g++ options: -flto -pthread

Neural Magic DeepSparse

This is a benchmark of Neural Magic's DeepSparse using its built-in deepsparse.benchmark utility and various models from their SparseZoo (https://sparsezoo.neuralmagic.com/). Learn more via the OpenBenchmarking.org test page.

Neural Magic DeepSparse 1.6 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better
gig: 1334.54
dd: 1336.39
G242-P36: 1358.18
SE +/- 8.37, N = 3

Neural Magic DeepSparse 1.6 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
gig: 46.73
dd: 46.50
G242-P36: 45.68
SE +/- 0.32, N = 3

Quicksilver

Quicksilver is a proxy application that represents some elements of the Mercury workload by solving a simplified dynamic Monte Carlo particle transport problem. Quicksilver is developed by Lawrence Livermore National Laboratory (LLNL) and this test profile currently makes use of the OpenMP CPU threaded code path. Learn more via the OpenBenchmarking.org test page.

Quicksilver 20230818 - Input: CTS2
Figure Of Merit, More Is Better
gig: 16460000
dd: 16430000
G242-P36: 16203333
SE +/- 42557.15, N = 3
1. (CXX) g++ options: -fopenmp -O3 -march=native

Timed Linux Kernel Compilation

This test times how long it takes to build the Linux kernel in a default configuration (defconfig) for the architecture being tested or alternatively an allmodconfig for building all possible kernel modules for the build. Learn more via the OpenBenchmarking.org test page.

Timed Linux Kernel Compilation 6.1 - Build: allmodconfig
Seconds, Fewer Is Better
G242-P36: 308.30
gig: 309.48
dd: 310.14
SE +/- 1.01, N = 3

Stress-NG

Stress-NG is a Linux stress tool developed by Colin Ian King. Learn more via the OpenBenchmarking.org test page.

Stress-NG 0.16.04 - Test: Atomic
Bogo Ops/s, More Is Better
G242-P36: 7.29
dd: 6.80
gig: 5.64
SE +/- 0.59, N = 15
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Timed LLVM Compilation

This test times how long it takes to compile/build the LLVM compiler stack. Learn more via the OpenBenchmarking.org test page.

Timed LLVM Compilation 16.0 - Build System: Ninja
Seconds, Fewer Is Better
dd: 264.74
G242-P36: 266.33
gig: 267.86
SE +/- 0.67, N = 3

Llama.cpp

Llama.cpp is a port of Facebook's LLaMA model in C/C++ developed by Georgi Gerganov. Llama.cpp allows the inference of LLaMA and other supported models in C/C++. For CPU inference Llama.cpp supports AVX2/AVX-512, ARM NEON, and other modern ISAs along with features like OpenBLAS usage. Learn more via the OpenBenchmarking.org test page.

Llama.cpp b1808 - Model: llama-2-70b-chat.Q5_0.gguf
Tokens Per Second, More Is Better
dd: 3.14
gig: 3.13
G242-P36: 3.07
SE +/- 0.03, N = 8
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -lopenblas

Stockfish

This is a test of Stockfish, an advanced open-source C++11 chess benchmark that can scale up to 512 CPU threads. Learn more via the OpenBenchmarking.org test page.

Stockfish 15 - Total Time
Nodes Per Second, More Is Better
dd: 226859548
G242-P36: 188653177
gig: 177653916
SE +/- 6857171.33, N = 15
1. (CXX) g++ options: -lgcov -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -flto -flto=jobserver
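
One quick way to gauge how far the three runs differ on a test is the spread between the best and the worst result. The sketch below is illustrative only: the helper name spread_percent is hypothetical, and the input values are the three Stockfish figures reported above.

```python
def spread_percent(results: dict[str, float]) -> float:
    """Percentage by which the highest result exceeds the lowest."""
    best = max(results.values())
    worst = min(results.values())
    return (best - worst) / worst * 100.0


# Stockfish 15, Nodes Per Second, as listed in this result file.
stockfish = {"dd": 226859548, "G242-P36": 188653177, "gig": 177653916}
print(f"Stockfish spread: {spread_percent(stockfish):.1f}%")
```

A large spread should be read alongside the reported standard error and run count before being treated as a real difference between runs, since all three results come from the same server.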

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.

OpenSSL 3.1 - Algorithm: ChaCha20-Poly1305
byte/s, More Is Better
gig: 112250396400
G242-P36: 112213448840
SE +/- 361309.16, N = 3
1. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

OpenSSL 3.1 - Algorithm: ChaCha20
byte/s, More Is Better
gig: 161791663040
G242-P36: 161732226070
SE +/- 10001054.79, N = 3
1. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

OpenSSL 3.1 - Algorithm: AES-256-GCM
byte/s, More Is Better
gig: 306544534870
G242-P36: 306487842680
SE +/- 40660594.45, N = 3
1. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

Speedb

Speedb is a next-generation key value storage engine that is RocksDB compatible and aiming for stability, efficiency, and performance. Learn more via the OpenBenchmarking.org test page.

Speedb 2.7 - Test: Read While Writing
Op/s, More Is Better
dd: 13785530
gig: 13255341
G242-P36: 12905035
SE +/- 201662.23, N = 15
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

RocksDB

This is a benchmark of Meta/Facebook's RocksDB as an embeddable persistent key-value store for fast storage based on Google's LevelDB. Learn more via the OpenBenchmarking.org test page.

RocksDB 8.0 - Test: Random Read
Op/s, More Is Better
gig: 450500912
G242-P36: 434052355
dd: 404291813
SE +/- 4162622.50, N = 15
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Quicksilver

Quicksilver is a proxy application that represents some elements of the Mercury workload by solving a simplified dynamic Monte Carlo particle transport problem. Quicksilver is developed by Lawrence Livermore National Laboratory (LLNL) and this test profile currently makes use of the OpenMP CPU threaded code path. Learn more via the OpenBenchmarking.org test page.

Quicksilver 20230818 - Input: CORAL2 P2
Figure Of Merit, More Is Better
G242-P36: 25543333
gig: 25520000
dd: 24460000
SE +/- 84129.53, N = 3
1. (CXX) g++ options: -fopenmp -O3 -march=native

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.

OpenSSL 3.1 - Algorithm: AES-128-GCM
byte/s, More Is Better
gig: 382856328260
dd: 382793028680
G242-P36: 382688207300
SE +/- 3586455.40, N = 3
1. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

OpenSSL 3.1 - Algorithm: SHA256
byte/s, More Is Better
G242-P36: 101322961753
dd: 101321237450
gig: 100039593750
SE +/- 64411674.99, N = 3
1. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

OpenSSL 3.1 - Algorithm: SHA512
byte/s, More Is Better
G242-P36: 34478769590
gig: 34453399030
dd: 34448701700
SE +/- 8688088.34, N = 3
1. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

Speedb

Speedb is a next-generation key value storage engine that is RocksDB compatible and aiming for stability, efficiency, and performance. Learn more via the OpenBenchmarking.org test page.

Speedb 2.7 - Test: Random Read
Op/s, More Is Better
dd: 420437471
gig: 418448304
G242-P36: 409571625
SE +/- 2947408.87, N = 11
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

RocksDB

This is a benchmark of Meta/Facebook's RocksDB as an embeddable persistent key-value store for fast storage based on Google's LevelDB. Learn more via the OpenBenchmarking.org test page.

RocksDB 8.0 - Test: Read While Writing
Op/s, More Is Better
dd: 8636563
G242-P36: 8558845
gig: 8516060
SE +/- 68677.29, N = 9
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Llama.cpp

Llama.cpp is a port of Facebook's LLaMA model in C/C++ developed by Georgi Gerganov. Llama.cpp allows the inference of LLaMA and other supported models in C/C++. For CPU inference Llama.cpp supports AVX2/AVX-512, ARM NEON, and other modern ISAs along with features like OpenBLAS usage. Learn more via the OpenBenchmarking.org test page.

Llama.cpp b1808 - Model: llama-2-13b.Q4_0.gguf
Tokens Per Second, More Is Better
dd: 14.11
gig: 14.02
G242-P36: 13.90
SE +/- 0.16, N = 15
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -lopenblas

CacheBench

This is a performance test of CacheBench, which is part of LLCbench. CacheBench is designed to test the memory and cache bandwidth performance. Learn more via the OpenBenchmarking.org test page.

CacheBench - Test: Read
MB/s, More Is Better
dd: 11438.86 (MIN: 11438.05 / MAX: 11439.05)
gig: 11438.67 (MIN: 11438.33 / MAX: 11438.85)
G242-P36: 11438.28 (MIN: 11437.32 / MAX: 11438.59)
SE +/- 0.01, N = 3
1. (CC) gcc options: -O3 -lrt

CacheBench - Test: Read / Modify / Write
MB/s, More Is Better
dd: 45041.15 (MIN: 43693.38 / MAX: 45647.65)
G242-P36: 45034.98 (MIN: 43692.22 / MAX: 45639.26)
gig: 45027.47 (MIN: 43694.36 / MAX: 45640.07)
SE +/- 2.04, N = 3
1. (CC) gcc options: -O3 -lrt

CacheBench - Test: Write
MB/s, More Is Better
dd: 38252.63 (MIN: 35291.37 / MAX: 41384.3)
gig: 38251.59 (MIN: 35289.91 / MAX: 41383.99)
G242-P36: 38239.97 (MIN: 35288.52 / MAX: 41382)
SE +/- 1.22, N = 3
1. (CC) gcc options: -O3 -lrt

RocksDB

This is a benchmark of Meta/Facebook's RocksDB as an embeddable persistent key-value store for fast storage based on Google's LevelDB. Learn more via the OpenBenchmarking.org test page.

RocksDB 8.0 - Test: Read Random Write Random
Op/s, More Is Better
dd: 3537322
gig: 3449038
G242-P36: 3320337
SE +/- 30568.75, N = 7
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Stress-NG

Stress-NG is a Linux stress tool developed by Colin Ian King. Learn more via the OpenBenchmarking.org test page.

Stress-NG 0.16.04 - Test: Futex
Bogo Ops/s, More Is Better
G242-P36: 343012.75
gig: 323012.96
dd: 318037.93
SE +/- 7072.24, N = 15
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 - Test: Context Switching
Bogo Ops/s, More Is Better
dd: 20708288.98
G242-P36: 20365273.28
gig: 19654874.85
SE +/- 174052.70, N = 15
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Neural Magic DeepSparse

This is a benchmark of Neural Magic's DeepSparse using its built-in deepsparse.benchmark utility and various models from their SparseZoo (https://sparsezoo.neuralmagic.com/). Learn more via the OpenBenchmarking.org test page.

Neural Magic DeepSparse 1.6 - Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better
G242-P36: 1320.14
gig: 1326.96
dd: 1328.00
SE +/- 0.85, N = 3

Neural Magic DeepSparse 1.6 - Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
G242-P36: 47.03
gig: 46.55
dd: 46.42
SE +/- 0.03, N = 3

Algebraic Multi-Grid Benchmark

AMG is a parallel algebraic multigrid solver for linear systems arising from problems on unstructured grids. The driver provided with AMG builds linear systems for various 3-dimensional problems. Learn more via the OpenBenchmarking.org test page.

Algebraic Multi-Grid Benchmark 1.2
Figure Of Merit, More Is Better
gig: 1060136000
G242-P36: 1057064333
SE +/- 47484.50, N = 3
1. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -lmpi

Timed Linux Kernel Compilation

This test times how long it takes to build the Linux kernel in a default configuration (defconfig) for the architecture being tested or alternatively an allmodconfig for building all possible kernel modules for the build. Learn more via the OpenBenchmarking.org test page.

Timed Linux Kernel Compilation 6.1 - Build: defconfig
Seconds, Fewer Is Better
G242-P36: 78.70
gig: 80.08
dd: 80.24
SE +/- 0.82, N = 3

Neural Magic DeepSparse

This is a benchmark of Neural Magic's DeepSparse using its built-in deepsparse.benchmark utility and various models from their SparseZoo (https://sparsezoo.neuralmagic.com/). Learn more via the OpenBenchmarking.org test page.

Neural Magic DeepSparse 1.6 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better
dd: 145.28
G242-P36: 146.75
gig: 149.48
SE +/- 1.58, N = 3

Neural Magic DeepSparse 1.6 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
dd: 433.86
G242-P36: 430.14
gig: 421.35
SE +/- 4.70, N = 3

GROMACS

This is a test of the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package using the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

GROMACS 2023 - Implementation: MPI CPU - Input: water_GMX50_bare
Ns Per Day, More Is Better
gig: 4.688
G242-P36: 4.588
SE +/- 0.002, N = 3
1. (CXX) g++ options: -O3

Stress-NG

Stress-NG is a Linux stress tool developed by Colin Ian King. Learn more via the OpenBenchmarking.org test page.

Stress-NG 0.16.04 - Test: MEMFD
Bogo Ops/s, More Is Better
gig: 576.53
G242-P36: 574.85
dd: 569.36
SE +/- 4.82, N = 8
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Speedb

Speedb is a next-generation key value storage engine that is RocksDB compatible and aiming for stability, efficiency, and performance. Learn more via the OpenBenchmarking.org test page.

Speedb 2.7 - Test: Random Fill
Op/s, More Is Better
dd: 285316
G242-P36: 284987
gig: 278264
SE +/- 1985.22, N = 3
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Speedb 2.7 - Test: Random Fill Sync
Op/s, More Is Better
dd: 207891
G242-P36: 207376
gig: 204410
SE +/- 1986.97, N = 3
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Speedb 2.7 - Test: Update Random
Op/s, More Is Better
G242-P36: 272275
gig: 264998
dd: 264748
SE +/- 1573.56, N = 3
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Speedb 2.7 - Test: Read Random Write Random
Op/s, More Is Better
gig: 2518519
dd: 2473336
G242-P36: 2419683
SE +/- 21596.32, N = 3
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

RocksDB

This is a benchmark of Meta/Facebook's RocksDB as an embeddable persistent key-value store for fast storage based on Google's LevelDB. Learn more via the OpenBenchmarking.org test page.

RocksDB 8.0 - Test: Update Random
Op/s, More Is Better
dd: 443804
G242-P36: 431406
gig: 427908
SE +/- 4409.44, N = 3
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.

OpenSSL 3.1 - Algorithm: RSA4096
verify/s, More Is Better
gig: 518115.9
dd: 518085.7
G242-P36: 517886.0
SE +/- 27.21, N = 3
1. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

OpenSSL 3.1 - Algorithm: RSA4096
sign/s, More Is Better
gig: 6345.6
dd: 6345.3
G242-P36: 6342.8
SE +/- 0.10, N = 3
1. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

Neural Magic DeepSparse

This is a benchmark of Neural Magic's DeepSparse using its built-in deepsparse.benchmark utility and various models from their SparseZoo (https://sparsezoo.neuralmagic.com/). Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-StreamgigG242-P36dd1326395265SE +/- 0.09, N = 355.4255.5755.72

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-StreamgigG242-P36dd2004006008001000SE +/- 1.48, N = 31141.451137.781135.44

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-StreamG242-P36gigdd400800120016002000SE +/- 0.45, N = 31830.581830.721843.24

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-StreamgigG242-P36dd816243240SE +/- 0.08, N = 333.8733.7533.16

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-StreamgigG242-P36dd400800120016002000SE +/- 1.29, N = 31832.121834.581850.23

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-StreamG242-P36gigdd816243240SE +/- 0.04, N = 333.6233.5833.24

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-StreamgigG242-P36dd70140210280350SE +/- 0.86, N = 3310.64310.84311.53

Neural Magic DeepSparse 1.6 (items/sec, More Is Better)
Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream
gig: 202.63; G242-P36: 202.23; dd: 201.76 (SE +/- 0.53, N = 3)

Quicksilver

Quicksilver is a proxy application that represents some elements of the Mercury workload by solving a simplified dynamic Monte Carlo particle transport problem. Quicksilver is developed by Lawrence Livermore National Laboratory (LLNL) and this test profile currently makes use of the OpenMP CPU threaded code path. Learn more via the OpenBenchmarking.org test page.

Quicksilver 20230818 (Figure Of Merit, More Is Better)
Input: CORAL2 P1
gig: 25810000; dd: 25510000; G242-P36: 25273333 (SE +/- 81103.50, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -march=native

Neural Magic DeepSparse

This is a benchmark of Neural Magic's DeepSparse using its built-in deepsparse.benchmark utility and various models from their SparseZoo (https://sparsezoo.neuralmagic.com/). Learn more via the OpenBenchmarking.org test page.

Neural Magic DeepSparse 1.6 (ms/batch, Fewer Is Better)
Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream
G242-P36: 314.15; gig: 315.90; dd: 316.91 (SE +/- 0.64, N = 3)

Neural Magic DeepSparse 1.6 (items/sec, More Is Better)
Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream
G242-P36: 200.03; gig: 198.91; dd: 198.31 (SE +/- 0.52, N = 3)

Neural Magic DeepSparse 1.6 (ms/batch, Fewer Is Better)
Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream
dd: 183.64; G242-P36: 185.36; gig: 185.87 (SE +/- 0.10, N = 3)

Neural Magic DeepSparse 1.6 (items/sec, More Is Better)
Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream
dd: 343.76; G242-P36: 339.98; gig: 339.52 (SE +/- 0.19, N = 3)

Neural Magic DeepSparse 1.6 (ms/batch, Fewer Is Better)
Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream
dd: 132.26; G242-P36: 132.30; gig: 132.32 (SE +/- 0.18, N = 3)

Neural Magic DeepSparse 1.6 (items/sec, More Is Better)
Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream
dd: 477.69; gig: 477.10; G242-P36: 476.38 (SE +/- 0.42, N = 3)

Neural Magic DeepSparse 1.6 (ms/batch, Fewer Is Better)
Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream
dd: 130.66; G242-P36: 132.10; gig: 133.45 (SE +/- 0.15, N = 3)

Neural Magic DeepSparse 1.6 (items/sec, More Is Better)
Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream
dd: 483.31; G242-P36: 477.81; gig: 472.07 (SE +/- 0.36, N = 3)
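
The ms/batch and items/sec figures reported for DeepSparse can be cross-checked against each other. The following is a rough sketch, assuming a batch size of 1 (not stated in this result file) and that ms/batch is the mean per-batch latency; under those assumptions Little's law gives the average number of batches in flight as throughput times latency. The helper name `batches_in_flight` is hypothetical and not part of DeepSparse.

```python
def batches_in_flight(items_per_sec: float, ms_per_batch: float,
                      batch_size: int = 1) -> float:
    """Little's law: mean concurrency = throughput x mean latency."""
    batches_per_sec = items_per_sec / batch_size  # batch_size=1 is an assumption
    return batches_per_sec * (ms_per_batch / 1000.0)

# "dd" run, ResNet-50 Baseline: 477.69 items/sec at 132.26 ms/batch
resnet = batches_in_flight(477.69, 132.26)
# "gig" run, BERT base uncased SST2, Sparse INT8: 1141.45 items/sec at 55.42 ms/batch
bert = batches_in_flight(1141.45, 55.42)

print(f"ResNet-50 Baseline: {resnet:.1f} batches in flight")
print(f"BERT SST2 INT8:     {bert:.1f} batches in flight")
```

Both cases work out to roughly 63 batches in flight, which would be consistent with about one stream per two of the 128 cores, although the stream count itself is not reported in this result file.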

7-Zip Compression

This is a test of 7-Zip compression/decompression with its integrated benchmark feature. Learn more via the OpenBenchmarking.org test page.

7-Zip Compression 22.01 (MIPS, More Is Better)
Test: Decompression Rating
dd: 541552; gig: 541204; G242-P36: 537647 (SE +/- 396.38, N = 3)
1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC

7-Zip Compression 22.01 (MIPS, More Is Better)
Test: Compression Rating
G242-P36: 333316; gig: 333057; dd: 331579 (SE +/- 991.66, N = 3)
1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC

Stress-NG

Stress-NG is a Linux stress tool developed by Colin Ian King. Learn more via the OpenBenchmarking.org test page.

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: IO_uring
gig: 612149.93; G242-P36: 604943.76; dd: 583751.83 (SE +/- 5192.48, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: MMAP
gig: 1104.19; dd: 1092.25; G242-P36: 1088.77 (SE +/- 5.43, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Cloning
G242-P36: 7795.96; gig: 7312.78; dd: 6918.49 (SE +/- 29.21, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Malloc
dd: 164592319.96; G242-P36: 164364343.39; gig: 164067515.18 (SE +/- 296218.44, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: CPU Cache
gig: 882510.28; dd: 882225.34; G242-P36: 879814.35 (SE +/- 1033.74, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Pthread
G242-P36: 113551.87; dd: 113379.28; gig: 112993.15 (SE +/- 65.20, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Zlib
gig: 5993.74; G242-P36: 5987.88; dd: 5985.69 (SE +/- 0.87, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Vector Shuffle
gig: 86375.79; dd: 86257.77; G242-P36: 86218.95 (SE +/- 3.20, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Vector Math
dd: 399042.09; gig: 398993.46; G242-P36: 398869.87 (SE +/- 4.53, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Wide Vector Math
gig: 2355564.94; dd: 2354926.97; G242-P36: 2346519.63 (SE +/- 6960.54, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Matrix Math
dd: 682554.33; gig: 682490.75; G242-P36: 681885.30 (SE +/- 404.39, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Function Call
gig: 72298.23; dd: 72290.81; G242-P36: 72283.18 (SE +/- 1.53, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Matrix 3D Math
G242-P36: 5099.81; dd: 5089.19; gig: 5082.65 (SE +/- 3.74, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: CPU Stress
gig: 33765.26; G242-P36: 33761.08; dd: 33559.87 (SE +/- 1.60, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: AVL Tree
dd: 299.99; G242-P36: 299.50; gig: 299.10 (SE +/- 0.16, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Crypto
G242-P36: 252315.26; dd: 251996.36; gig: 251986.12 (SE +/- 928.63, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Fused Multiply-Add
gig: 151387869.76; G242-P36: 151220570.51; dd: 151037296.46 (SE +/- 110268.18, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Hash
G242-P36: 15671801.48; gig: 15654462.92; dd: 15654282.58 (SE +/- 9429.94, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: SENDFILE
gig: 1624969.46; dd: 1624702.09; G242-P36: 1624492.92 (SE +/- 18.53, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: AVX-512 VNNI
dd: 4692452.80; gig: 4691697.85; G242-P36: 4690386.64 (SE +/- 401.84, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Glibc Qsort Data Sorting
gig: 2022.01; dd: 2020.30; G242-P36: 2020.18 (SE +/- 0.78, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Vector Floating Point
dd: 102604.74; gig: 102553.11; G242-P36: 102535.35 (SE +/- 25.89, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Floating Point
dd: 22220.70; gig: 22219.80; G242-P36: 22213.54 (SE +/- 0.42, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Poll
dd: 7395099.64; gig: 7392099.82; G242-P36: 7330369.96 (SE +/- 12697.25, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Glibc C String Functions
gig: 62867317.16; dd: 62845443.53; G242-P36: 62783286.48 (SE +/- 17918.08, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: System V Message Passing
G242-P36: 21143237.72; dd: 21119614.31; gig: 21054213.79 (SE +/- 32907.24, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Forking
G242-P36: 52250.53; dd: 50686.58; gig: 50130.97 (SE +/- 410.62, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Memory Copying
gig: 27162.14; dd: 27159.07; G242-P36: 27153.74 (SE +/- 1.16, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Semaphores
gig: 167850957.68; G242-P36: 167637763.59; dd: 166379337.67 (SE +/- 217685.76, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Mutex
dd: 37267646.91; gig: 37215286.04; G242-P36: 37172432.66 (SE +/- 9463.26, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Mixed Scheduler
G242-P36: 36794.33; dd: 36361.29; gig: 36309.29 (SE +/- 141.59, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: NUMA
dd: 1426.45; G242-P36: 1419.06; gig: 1416.03 (SE +/- 2.47, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Pipe
dd: 30776841.73; G242-P36: 30330081.18; gig: 29805509.12 (SE +/- 95784.06, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Test: Socket Activity
G242-P36: 28009.07; gig: 27959.85; dd: 27536.79 (SE +/- 159.43, N = 3)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Llama.cpp

Llama.cpp is a port of Facebook's LLaMA model in C/C++ developed by Georgi Gerganov. Llama.cpp allows the inference of LLaMA and other supported models in C/C++. For CPU inference Llama.cpp supports AVX2/AVX-512, ARM NEON, and other modern ISAs along with features like OpenBLAS usage. Learn more via the OpenBenchmarking.org test page.

Llama.cpp b1808 (Tokens Per Second, More Is Better)
Model: llama-2-7b.Q4_0.gguf
dd: 26.64; gig: 21.90; G242-P36: 21.58 (SE +/- 0.21, N = 6)
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -lopenblas
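
The Llama.cpp result is one of the few here with a visible spread between runs. As a minimal sketch, the relative difference for a higher-is-better metric can be computed from the three reported values; the function name `percent_faster` is hypothetical, and using G242-P36 as the baseline is an arbitrary choice for illustration.

```python
def percent_faster(value: float, baseline: float) -> float:
    """Percent by which `value` exceeds `baseline` (higher-is-better metric)."""
    return (value - baseline) / baseline * 100.0

# Tokens per second as reported for llama-2-7b.Q4_0.gguf
results = {"dd": 26.64, "gig": 21.90, "G242-P36": 21.58}
baseline = results["G242-P36"]

for name, tokens_per_sec in results.items():
    delta = percent_faster(tokens_per_sec, baseline)
    print(f"{name}: {delta:+.1f}% vs G242-P36")
```

With these inputs the dd run comes out a little over 23% ahead of G242-P36, while gig is within about 1.5%. The result file does not say what changed between the runs.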

miniFE

MiniFE Finite Element is an application for unstructured implicit finite element codes. Learn more via the OpenBenchmarking.org test page.

miniFE 2.2 (CG Mflops, More Is Better)
Problem Size: Small
gig: 24150.7; G242-P36: 23996.0 (SE +/- 14.30, N = 4)
1. (CXX) g++ options: -O3 -fopenmp -lmpi_cxx -lmpi

ACES DGEMM

This is a multi-threaded DGEMM benchmark. Learn more via the OpenBenchmarking.org test page.

ACES DGEMM 1.0 (GFLOP/s, More Is Better)
Sustained Floating-Point Rate
gig: 18.27; G242-P36: 17.78 (SE +/- 0.09, N = 4)
1. (CC) gcc options: -O3 -march=native -fopenmp
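
For readers unfamiliar with how a GFLOP/s figure is derived for a DGEMM, here is an illustrative sketch. It is not the ACES DGEMM source: it is a naive, single-threaded pure-Python version of C = alpha*A*B + beta*C, and it assumes the common convention of counting 2*n^3 floating-point operations for an n x n multiply. The function names and the matrix size are chosen for illustration only.

```python
import time

def dgemm(alpha, a, b, beta, c):
    """Naive C = alpha*A*B + beta*C on square lists of lists."""
    n = len(a)
    b_cols = list(zip(*b))  # columns of B, so each inner product is a zip
    return [[alpha * sum(x * y for x, y in zip(a[i], b_cols[j])) + beta * c[i][j]
             for j in range(n)]
            for i in range(n)]

def gflops(n, seconds):
    # 2*n^3: one multiply and one add per inner-product term (assumed convention)
    return 2.0 * n ** 3 / seconds / 1e9

n = 64  # deliberately tiny; real runs use far larger matrices
a = [[1.0] * n for _ in range(n)]
b = [[2.0] * n for _ in range(n)]
c = [[0.5] * n for _ in range(n)]

start = time.perf_counter()
result = dgemm(1.0, a, b, 1.0, c)
elapsed = time.perf_counter() - start

print(f"{gflops(n, elapsed):.4f} GFLOP/s (pure Python, n={n})")
```

The absolute number from this sketch is not comparable to the result above, which comes from compiled, OpenMP-threaded code; only the flops-over-time arithmetic carries over.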

110 Results Shown

PyTorch:
  CPU - 1 - Efficientnet_v2_l
  CPU - 16 - ResNet-152
  CPU - 16 - ResNet-50
  CPU - 1 - ResNet-152
  CPU - 1 - ResNet-50
Xmrig
Speedb
Xmrig
Neural Magic DeepSparse:
  ResNet-50, Sparse INT8 - Asynchronous Multi-Stream:
    ms/batch
    items/sec
Timed LLVM Compilation
LeelaChessZero:
  BLAS
  Eigen
Neural Magic DeepSparse:
  CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream:
    ms/batch
    items/sec
Quicksilver
Timed Linux Kernel Compilation
Stress-NG
Timed LLVM Compilation
Llama.cpp
Stockfish
OpenSSL:
  ChaCha20-Poly1305
  ChaCha20
  AES-256-GCM
Speedb
RocksDB
Quicksilver
OpenSSL:
  AES-128-GCM
  SHA256
  SHA512
Speedb
RocksDB
Llama.cpp
CacheBench:
  Read
  Read / Modify / Write
  Write
RocksDB
Stress-NG:
  Futex
  Context Switching
Neural Magic DeepSparse:
  BERT-Large, NLP Question Answering - Asynchronous Multi-Stream:
    ms/batch
    items/sec
Algebraic Multi-Grid Benchmark
Timed Linux Kernel Compilation
Neural Magic DeepSparse:
  BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream:
    ms/batch
    items/sec
GROMACS
Stress-NG
Speedb:
  Rand Fill
  Rand Fill Sync
  Update Rand
  Read Rand Write Rand
RocksDB
OpenSSL:
  RSA4096:
    verify/s
    sign/s
Neural Magic DeepSparse:
  NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream:
    ms/batch
    items/sec
  NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream:
    ms/batch
    items/sec
  NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream:
    ms/batch
    items/sec
  CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream:
    ms/batch
    items/sec
Quicksilver
Neural Magic DeepSparse:
  CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream:
    ms/batch
    items/sec
  NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream:
    ms/batch
    items/sec
  ResNet-50, Baseline - Asynchronous Multi-Stream:
    ms/batch
    items/sec
  CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream:
    ms/batch
    items/sec
7-Zip Compression:
  Decompression Rating
  Compression Rating
Stress-NG:
  IO_uring
  MMAP
  Cloning
  Malloc
  CPU Cache
  Pthread
  Zlib
  Vector Shuffle
  Vector Math
  Wide Vector Math
  Matrix Math
  Function Call
  Matrix 3D Math
  CPU Stress
  AVL Tree
  Crypto
  Fused Multiply-Add
  Hash
  SENDFILE
  AVX-512 VNNI
  Glibc Qsort Data Sorting
  Vector Floating Point
  Floating Point
  Poll
  Glibc C String Functions
  System V Message Passing
  Forking
  Memory Copying
  Semaphores
  Mutex
  Mixed Scheduler
  NUMA
  Pipe
  Socket Activity
Llama.cpp
miniFE
ACES DGEMM