Gigabyte G242-P36 Ampere Altra Max Server

Benchmarks by Michael Larabel.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2401172-NE-GIGABYTEG53

Result Identifier: G242-P36
Date Run: January 16
Test Duration: 19 Hours, 11 Minutes


Gigabyte G242-P36 Ampere Altra Max Server (OpenBenchmarking.org, Phoronix Test Suite)

Processor: ARMv8 Neoverse-N1 @ 3.00GHz (128 Cores)
Motherboard: GIGABYTE G242-P36-00 MP32-AR2-00 v01000100 (F31k SCP
Chipset: Ampere Computing LLC Altra PCI Root Complex A
Memory: 16 x 32 GB DDR4-3200MT/s Samsung M393A4K40DB3-CWE
Disk: 800GB Micron_7450_MTFDKBA800TFS
Graphics: ASPEED
Monitor: VGA HDMI
Network: 2 x Intel I350
OS: Ubuntu 23.10
Kernel: 6.5.0-13-generic (aarch64)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 1920x1080

System Logs

- Transparent Huge Pages: madvise
- --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
- Scaling Governor: cppc_cpufreq performance (Boost: Disabled)
- Python 3.11.6
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected

Gigabyte G242-P36 Ampere Altra Max Server: Result Overview

Test | G242-P36
pytorch: CPU - 1 - ResNet-50 | 1.91
pytorch: CPU - 1 - ResNet-152 | 0.68
pytorch: CPU - 1 - Efficientnet_v2_l | 0.30
pytorch: CPU - 16 - ResNet-50 | 1.83
pytorch: CPU - 16 - ResNet-152 | 0.67
stress-ng: CPU Stress | 33761.08
stress-ng: Crypto | 252315.26
stress-ng: Memory Copying | 27153.74
stress-ng: Glibc Qsort Data Sorting | 2020.18
stress-ng: Glibc C String Functions | 62783286.48
stress-ng: Vector Math | 398869.87
stress-ng: Matrix Math | 681885.30
stress-ng: Forking | 52250.53
stress-ng: System V Message Passing | 21143237.72
stress-ng: Semaphores | 167637763.59
stress-ng: Socket Activity | 28009.07
stress-ng: Context Switching | 20365273.28
stress-ng: Atomic | 7.29
stress-ng: CPU Cache | 879814.35
stress-ng: Malloc | 164364343.39
stress-ng: MEMFD | 574.85
stress-ng: MMAP | 1088.77
stress-ng: NUMA | 1419.06
stress-ng: SENDFILE | 1624492.92
stress-ng: IO_uring | 604943.76
stress-ng: Futex | 343012.75
stress-ng: Mutex | 37172432.66
stress-ng: Function Call | 72283.18
stress-ng: Poll | 7330369.96
stress-ng: Hash | 15671801.48
stress-ng: Pthread | 113551.87
stress-ng: Zlib | 5987.88
stress-ng: Floating Point | 22213.54
stress-ng: Fused Multiply-Add | 151220570.51
stress-ng: Pipe | 30330081.18
stress-ng: Matrix 3D Math | 5099.81
stress-ng: AVL Tree | 299.50
stress-ng: Vector Floating Point | 102535.35
stress-ng: Vector Shuffle | 86218.95
stress-ng: Wide Vector Math | 2346519.63
stress-ng: Cloning | 7795.96
stress-ng: AVX-512 VNNI | 4690386.64
stress-ng: Mixed Scheduler | 36794.33
openssl: SHA256 | 101322961753
openssl: SHA512 | 34478769590
openssl: AES-128-GCM | 382688207300
openssl: AES-256-GCM | 306487842680
openssl: ChaCha20 | 161732226070
openssl: ChaCha20-Poly1305 | 112213448840
minife: Small | 23996.0
quicksilver: CORAL2 P1 | 25273333
quicksilver: CORAL2 P2 | 25543333
quicksilver: CTS2 | 16203333
amg: | 1057064333
mt-dgemm: Sustained Floating-Point Rate | 17.784983
xmrig: Monero - 1M | 4201.7
xmrig: Wownero - 1M | 1935.2
deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream | 1137.781
deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream | 339.9765
deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream | 477.8141
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream | 33.6229
deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream | 200.0280
deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream | 202.2279
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream | 33.7473
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream | 45.6788
deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream | 476.3781
deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream | 2677.0708
deepsparse: BERT-Large, NLP Question Answering - Asynchronous Multi-Stream | 47.0250
deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream | 430.1375
cachebench: Read | 11438.276516
cachebench: Write | 38239.970730
cachebench: Read / Modify / Write | 45034.976156
compress-7zip: Compression Rating | 333316
compress-7zip: Decompression Rating | 537647
lczero: BLAS | 62
lczero: Eigen | 48
stockfish: Total Time | 188653177
gromacs: MPI CPU - water_GMX50_bare | 4.588
speedb: Seq Fill | 295079
speedb: Rand Fill | 284987
speedb: Rand Fill Sync | 207376
speedb: Rand Read | 409571625
speedb: Read While Writing | 12905035
speedb: Read Rand Write Rand | 2419683
speedb: Update Rand | 272275
rocksdb: Rand Read | 434052355
rocksdb: Read While Writing | 8558845
rocksdb: Read Rand Write Rand | 3320337
rocksdb: Update Rand | 431406
openssl: RSA4096 | 6342.8
llama-cpp: llama-2-7b.Q4_0.gguf | 21.58
llama-cpp: llama-2-13b.Q4_0.gguf | 13.90
llama-cpp: llama-2-70b-chat.Q5_0.gguf | 3.07
openssl: RSA4096 | 517886.0
deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream | 55.5703
deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream | 185.3571
deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream | 132.1032
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream | 1834.5799
deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream | 314.1452
deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream | 310.8403
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream | 1830.5760
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream | 1358.1773
deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream | 132.3047
deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream | 23.5004
deepsparse: BERT-Large, NLP Question Answering - Asynchronous Multi-Stream | 1320.1354
deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream | 146.7462
build-linux-kernel: defconfig | 78.703
build-linux-kernel: allmodconfig | 308.297
build-llvm: Ninja | 266.333
build-llvm: Unix Makefiles | 411.521

CPU Temperature Monitor

Phoronix Test Suite System Monitoring, Celsius. G242-P36: Min: 35 / Avg: 61.54 / Max: 84

CPU Peak Freq (Highest CPU Core Frequency) Monitor

Phoronix Test Suite System Monitoring, Megahertz. G242-P36: 3000

CPU Power Consumption Monitor

Phoronix Test Suite System Monitoring, Watts. G242-P36: Min: 21.48 / Avg: 119.26 / Max: 284.79

System Power Consumption Monitor

Phoronix Test Suite System Monitoring, Watts. G242-P36: Min: 121 / Avg: 261.83 / Max: 673

PyTorch

This is a benchmark of PyTorch making use of pytorch-benchmark [https://github.com/LukasHedegaard/pytorch-benchmark]. Currently this test profile is catered to CPU-based testing. Learn more via the OpenBenchmarking.org test page.

PyTorch 2.1, batches/sec, More Is Better (G242-P36):
Device: CPU - Batch Size: 1 - Model: ResNet-50 | 1.91 | SE +/- 0.00, N = 3 | MIN: 1.8 / MAX: 2.09
Device: CPU - Batch Size: 1 - Model: ResNet-152 | 0.68 | SE +/- 0.00, N = 3 | MIN: 0.65 / MAX: 0.7
Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l | 0.30 | SE +/- 0.00, N = 3 | MIN: 0.27 / MAX: 0.4
Device: CPU - Batch Size: 16 - Model: ResNet-50 | 1.83 | SE +/- 0.02, N = 5 | MIN: 1.7 / MAX: 2.02
Device: CPU - Batch Size: 16 - Model: ResNet-152 | 0.67 | SE +/- 0.00, N = 2 | MIN: 0.65 / MAX: 0.7

Stress-NG

Stress-NG is a Linux stress tool developed by Colin Ian King. Learn more via the OpenBenchmarking.org test page.

Stress-NG 0.16.04, Bogo Ops/s, More Is Better (G242-P36):
Test: CPU Stress | 33761.08 | SE +/- 1.60, N = 3
Test: Crypto | 252315.26 | SE +/- 928.63, N = 3
Test: Memory Copying | 27153.74 | SE +/- 1.16, N = 3
Test: Glibc Qsort Data Sorting | 2020.18 | SE +/- 0.78, N = 3
Test: Glibc C String Functions | 62783286.48 | SE +/- 17918.08, N = 3
Test: Vector Math | 398869.87 | SE +/- 4.53, N = 3
Test: Matrix Math | 681885.30 | SE +/- 404.39, N = 3
Test: Forking | 52250.53 | SE +/- 410.62, N = 3
Test: System V Message Passing | 21143237.72 | SE +/- 32907.24, N = 3
Test: Semaphores | 167637763.59 | SE +/- 217685.76, N = 3
Test: Socket Activity | 28009.07 | SE +/- 159.43, N = 3
Test: Context Switching | 20365273.28 | SE +/- 174052.70, N = 15
Test: Atomic | 7.29 | SE +/- 0.59, N = 15
Test: CPU Cache | 879814.35 | SE +/- 1033.74, N = 3
Test: Malloc | 164364343.39 | SE +/- 296218.44, N = 3
Test: MEMFD | 574.85 | SE +/- 4.82, N = 8
Test: MMAP | 1088.77 | SE +/- 5.43, N = 3
Test: NUMA | 1419.06 | SE +/- 2.47, N = 3
Test: SENDFILE | 1624492.92 | SE +/- 18.53, N = 3
Test: IO_uring | 604943.76 | SE +/- 5192.48, N = 3
Test: Futex | 343012.75 | SE +/- 7072.24, N = 15
Test: Mutex | 37172432.66 | SE +/- 9463.26, N = 3
Test: Function Call | 72283.18 | SE +/- 1.53, N = 3
Test: Poll | 7330369.96 | SE +/- 12697.25, N = 3
Test: Hash | 15671801.48 | SE +/- 9429.94, N = 3
Test: Pthread | 113551.87 | SE +/- 65.20, N = 3
Test: Zlib | 5987.88 | SE +/- 0.87, N = 3
Test: Floating Point | 22213.54 | SE +/- 0.42, N = 3
Test: Fused Multiply-Add | 151220570.51 | SE +/- 110268.18, N = 3
Test: Pipe | 30330081.18 | SE +/- 95784.06, N = 3
Test: Matrix 3D Math | 5099.81 | SE +/- 3.74, N = 3
Test: AVL Tree | 299.50 | SE +/- 0.16, N = 3
Test: Vector Floating Point | 102535.35 | SE +/- 25.89, N = 3
Test: Vector Shuffle | 86218.95 | SE +/- 3.20, N = 3
Test: Wide Vector Math | 2346519.63 | SE +/- 6960.54, N = 3
Test: Cloning | 7795.96 | SE +/- 29.21, N = 3
Test: AVX-512 VNNI | 4690386.64 | SE +/- 401.84, N = 3
Test: Mixed Scheduler | 36794.33 | SE +/- 141.59, N = 3
1. (CXX) g++ options: -O2 -std=gnu99 -lc
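
One rough way to see which Stress-NG results are noisy is to divide the reported standard error by the reported result. The sketch below does this for four of the Stress-NG results listed in this section. The names `stress_ng` and `relative_se_percent` are illustrative only and are not part of the Phoronix Test Suite.

```python
# (result in Bogo Ops/s, standard error) copied from the Stress-NG results
# in this section. The dictionary and helper names are illustrative only.
stress_ng = {
    "CPU Stress": (33761.08, 1.60),
    "Context Switching": (20365273.28, 174052.70),
    "Atomic": (7.29, 0.59),
    "Futex": (343012.75, 7072.24),
}


def relative_se_percent(result, se):
    """Standard error expressed as a percentage of the reported result."""
    return 100.0 * se / result


# Sort from the noisiest result to the most stable one.
ranked = sorted(
    stress_ng.items(),
    key=lambda item: relative_se_percent(*item[1]),
    reverse=True,
)

for test, (result, se) in ranked:
    print(f"{test:18s} {relative_se_percent(result, se):7.3f}%")
```

Atomic, Futex and Context Switching, the three noisiest entries in this sample, are also the Stress-NG results reported with N = 15 rather than N = 3.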

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.

OpenSSL 3.1, byte/s, More Is Better (G242-P36):
Algorithm: SHA256 | 101322961753 | SE +/- 64411674.99, N = 3
Algorithm: SHA512 | 34478769590 | SE +/- 8688088.34, N = 3
Algorithm: AES-128-GCM | 382688207300 | SE +/- 3586455.40, N = 3
Algorithm: AES-256-GCM | 306487842680 | SE +/- 40660594.45, N = 3
Algorithm: ChaCha20 | 161732226070 | SE +/- 10001054.79, N = 3
Algorithm: ChaCha20-Poly1305 | 112213448840 | SE +/- 361309.16, N = 3
1. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl
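
The OpenSSL results are reported in raw byte/s, which is hard to read at this magnitude. The sketch below converts them to decimal GB/s and, as a rough guide only, divides by the 128 cores listed in the system table. The per-core figure assumes the benchmark used every core, which this result file does not state. The names `openssl_bytes_per_sec`, `to_gb_per_sec` and `CORES` are illustrative.

```python
# Core count from the system table. Dividing by it ASSUMES the run used
# every core, which the result file does not state.
CORES = 128

# byte/s results copied from the OpenSSL 3.1 results in this section.
openssl_bytes_per_sec = {
    "SHA256": 101322961753,
    "SHA512": 34478769590,
    "AES-128-GCM": 382688207300,
    "AES-256-GCM": 306487842680,
    "ChaCha20": 161732226070,
    "ChaCha20-Poly1305": 112213448840,
}


def to_gb_per_sec(bytes_per_sec):
    """Convert byte/s to decimal gigabytes per second (1 GB = 1e9 bytes)."""
    return bytes_per_sec / 1e9


for algorithm, rate in openssl_bytes_per_sec.items():
    total = to_gb_per_sec(rate)
    print(f"{algorithm:18s} {total:8.2f} GB/s total, {total / CORES:5.2f} GB/s per core")
```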

miniFE

MiniFE Finite Element is an application for unstructured implicit finite element codes. Learn more via the OpenBenchmarking.org test page.

miniFE 2.2, CG Mflops, More Is Better (G242-P36):
Problem Size: Small | 23996.0 | SE +/- 14.30, N = 4
1. (CXX) g++ options: -O3 -fopenmp -lmpi_cxx -lmpi

Quicksilver

Quicksilver is a proxy application that represents some elements of the Mercury workload by solving a simplified dynamic Monte Carlo particle transport problem. Quicksilver is developed by Lawrence Livermore National Laboratory (LLNL) and this test profile currently makes use of the OpenMP CPU threaded code path. Learn more via the OpenBenchmarking.org test page.

Quicksilver 20230818, Figure Of Merit, More Is Better (G242-P36):
Input: CORAL2 P1 | 25273333 | SE +/- 81103.50, N = 3
Input: CORAL2 P2 | 25543333 | SE +/- 84129.53, N = 3
Input: CTS2 | 16203333 | SE +/- 42557.15, N = 3
1. (CXX) g++ options: -fopenmp -O3 -march=native

Algebraic Multi-Grid Benchmark

AMG is a parallel algebraic multigrid solver for linear systems arising from problems on unstructured grids. The driver provided with AMG builds linear systems for various 3-dimensional problems. Learn more via the OpenBenchmarking.org test page.

Algebraic Multi-Grid Benchmark 1.2, Figure Of Merit, More Is Better (G242-P36):
1057064333 | SE +/- 47484.50, N = 3
1. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -lmpi

ACES DGEMM

This is a multi-threaded DGEMM benchmark. Learn more via the OpenBenchmarking.org test page.

ACES DGEMM 1.0, GFLOP/s, More Is Better (G242-P36):
Sustained Floating-Point Rate | 17.78 | SE +/- 0.09, N = 4
1. (CC) gcc options: -O3 -march=native -fopenmp

Xmrig

Xmrig is an open-source cross-platform CPU/GPU miner for RandomX, KawPow, CryptoNight and AstroBWT. This test profile is set up to measure Xmrig CPU mining performance. Learn more via the OpenBenchmarking.org test page.

Xmrig 6.21, H/s, More Is Better (G242-P36):
Variant: Monero - Hash Count: 1M | 4201.7 | SE +/- 17.55, N = 3
Variant: Wownero - Hash Count: 1M | 1935.2 | SE +/- 2.92, N = 3
1. (CXX) g++ options: -fexceptions -fno-rtti -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

Neural Magic DeepSparse

This is a benchmark of Neural Magic's DeepSparse using its built-in deepsparse.benchmark utility and various models from their SparseZoo (https://sparsezoo.neuralmagic.com/). Learn more via the OpenBenchmarking.org test page.

Neural Magic DeepSparse 1.6, items/sec, More Is Better, Scenario: Asynchronous Multi-Stream (G242-P36):
Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 | 1137.78 | SE +/- 1.48, N = 3
Model: NLP Text Classification, DistilBERT mnli | 339.98 | SE +/- 0.19, N = 3
Model: CV Classification, ResNet-50 ImageNet | 477.81 | SE +/- 0.36, N = 3
Model: NLP Token Classification, BERT base uncased conll2003 | 33.62 | SE +/- 0.04, N = 3
Model: CV Detection, YOLOv5s COCO | 200.03 | SE +/- 0.52, N = 3
Model: CV Detection, YOLOv5s COCO, Sparse INT8 | 202.23 | SE +/- 0.53, N = 3
Model: NLP Document Classification, oBERT base uncased on IMDB | 33.75 | SE +/- 0.08, N = 3
Model: CV Segmentation, 90% Pruned YOLACT Pruned | 45.68 | SE +/- 0.32, N = 3
Model: ResNet-50, Baseline | 476.38 | SE +/- 0.42, N = 3
Model: ResNet-50, Sparse INT8 | 2677.07 | SE +/- 25.24, N = 3
Model: BERT-Large, NLP Question Answering | 47.03 | SE +/- 0.03, N = 3
Model: BERT-Large, NLP Question Answering, Sparse INT8 | 430.14 | SE +/- 4.70, N = 3
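
Three of the DeepSparse models appear both in a dense form and in a Sparse INT8 form, so the throughput gain from sparsification and quantization can be read off as a simple ratio. The sketch below computes that ratio from the items/sec results in this section. The pairing of model names and the helper `speedup` are illustrative assumptions; the result file itself does not pair the models.

```python
# items/sec results copied from the DeepSparse results in this section.
throughput = {
    "ResNet-50, Baseline": 476.38,
    "ResNet-50, Sparse INT8": 2677.07,
    "BERT-Large, NLP Question Answering": 47.03,
    "BERT-Large, NLP Question Answering, Sparse INT8": 430.14,
    "CV Detection, YOLOv5s COCO": 200.03,
    "CV Detection, YOLOv5s COCO, Sparse INT8": 202.23,
}

# (dense model, sparse INT8 model) pairs. This pairing is an assumption
# based on the model names only.
pairs = [
    ("ResNet-50, Baseline", "ResNet-50, Sparse INT8"),
    ("BERT-Large, NLP Question Answering",
     "BERT-Large, NLP Question Answering, Sparse INT8"),
    ("CV Detection, YOLOv5s COCO", "CV Detection, YOLOv5s COCO, Sparse INT8"),
]


def speedup(dense, sparse):
    """Throughput of the sparse INT8 model relative to the dense model."""
    return throughput[sparse] / throughput[dense]


for dense, sparse in pairs:
    print(f"{dense}: {speedup(dense, sparse):.2f}x with Sparse INT8")
```

On these numbers the gain varies widely by model: large for ResNet-50 and BERT-Large, close to none for YOLOv5s.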

CacheBench

This is a performance test of CacheBench, which is part of LLCbench. CacheBench is designed to test memory and cache bandwidth performance. Learn more via the OpenBenchmarking.org test page.

CacheBench, MB/s, More Is Better (G242-P36):
Test: Read | 11438.28 | SE +/- 0.01, N = 3 | MIN: 11437.32 / MAX: 11438.59
Test: Write | 38239.97 | SE +/- 1.22, N = 3 | MIN: 35288.52 / MAX: 41382
Test: Read / Modify / Write | 45034.98 | SE +/- 2.04, N = 3 | MIN: 43692.22 / MAX: 45639.26
1. (CC) gcc options: -O3 -lrt

CPU Peak Freq (Highest CPU Core Frequency) Monitor, Per Test

Megahertz. Every per-test monitor plot for G242-P36 shows 3000:
Neural Magic DeepSparse 1.6 (11 plots)
LeelaChessZero 0.30 (2 plots)
Llama.cpp b1808 (3 plots)
Speedb 2.7 (7 plots)
Timed Linux Kernel Compilation 6.1 (2 plots)
Timed LLVM Compilation 16.0 (2 plots)
7-Zip Compression 22.01 (1 plot)
RocksDB 8.0 (4 plots)
Stockfish 15 (1 plot)
OpenSSL 3.1 (1 plot)
GROMACS 2023 (1 plot)

7-Zip Compression

This is a test of 7-Zip compression/decompression with its integrated benchmark feature. Learn more via the OpenBenchmarking.org test page.

7-Zip Compression 22.01, MIPS, More Is Better (G242-P36):
Test: Compression Rating | 333316 | SE +/- 991.66, N = 3
Test: Decompression Rating | 537647 | SE +/- 396.38, N = 3
1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC

LeelaChessZero

LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.

LeelaChessZero 0.30, Nodes Per Second, More Is Better (G242-P36):
Backend: BLAS | 62 | SE +/- 0.58, N = 3
Backend: Eigen | 48 | SE +/- 0.33, N = 3
1. (CXX) g++ options: -flto -pthread

Stockfish

This is a test of Stockfish, an advanced open-source C++11 chess benchmark that can scale up to 512 CPU threads. Learn more via the OpenBenchmarking.org test page.

Stockfish 15, Nodes Per Second, More Is Better (G242-P36):
Total Time | 188653177 | SE +/- 6857171.33, N = 15
1. (CXX) g++ options: -lgcov -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -flto -flto=jobserver

GROMACS

This is a test of the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package using the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

GROMACS 2023, Ns Per Day, More Is Better (G242-P36):
Implementation: MPI CPU - Input: water_GMX50_bare | 4.588 | SE +/- 0.002, N = 3
1. (CXX) g++ options: -O3

Speedb

Speedb is a next-generation key value storage engine that is RocksDB compatible and aiming for stability, efficiency, and performance. Learn more via the OpenBenchmarking.org test page.

Speedb 2.7, Op/s, More Is Better (G242-P36):
Test: Sequential Fill | 295079 | SE +/- 3101.60, N = 5
Test: Random Fill | 284987 | SE +/- 1985.22, N = 3
Test: Random Fill Sync | 207376 | SE +/- 1986.97, N = 3
Test: Random Read | 409571625 | SE +/- 2947408.87, N = 11
Test: Read While Writing | 12905035 | SE +/- 201662.23, N = 15
Test: Read Random Write Random | 2419683 | SE +/- 21596.32, N = 3
Test: Update Random | 272275 | SE +/- 1573.56, N = 3
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

RocksDB

This is a benchmark of Meta/Facebook's RocksDB as an embeddable persistent key-value store for fast storage based on Google's LevelDB. Learn more via the OpenBenchmarking.org test page.

RocksDB 8.0 (Op/s, More Is Better), G242-P36:
  Test: Random Read: 434052355 (SE +/- 4162622.50, N = 15)
  Test: Read While Writing: 8558845 (SE +/- 68677.29, N = 9)
  Test: Read Random Write Random: 3320337 (SE +/- 30568.75, N = 7)
  Test: Update Random: 431406 (SE +/- 4409.44, N = 3)
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.

OpenSSL 3.1 (sign/s, More Is Better), G242-P36:
  Algorithm: RSA4096: 6342.8 (SE +/- 0.10, N = 3)
1. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

Llama.cpp

Llama.cpp is a port of Facebook's LLaMA model in C/C++ developed by Georgi Gerganov. Llama.cpp allows the inference of LLaMA and other supported models in C/C++. For CPU inference Llama.cpp supports AVX2/AVX-512, ARM NEON, and other modern ISAs along with features like OpenBLAS usage. Learn more via the OpenBenchmarking.org test page.

Llama.cpp b1808 (Tokens Per Second, More Is Better), G242-P36:
  Model: llama-2-7b.Q4_0.gguf: 21.58 (SE +/- 0.21, N = 6)
  Model: llama-2-13b.Q4_0.gguf: 13.90 (SE +/- 0.16, N = 15)
  Model: llama-2-70b-chat.Q5_0.gguf: 3.07 (SE +/- 0.03, N = 8)
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -lopenblas
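Tokens per second can be easier to reason about as a per-token latency. The sketch below simply inverts the three Llama.cpp throughput figures reported above; `ms_per_token` is a hypothetical helper, and the conversion assumes the reported rate is a steady average.

```python
def ms_per_token(tokens_per_second: float) -> float:
    """Convert a throughput in tokens/s to average milliseconds per token."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return 1000.0 / tokens_per_second


# Throughput values copied from the Llama.cpp b1808 results above.
llama_results = {
    "llama-2-7b.Q4_0.gguf": 21.58,
    "llama-2-13b.Q4_0.gguf": 13.90,
    "llama-2-70b-chat.Q5_0.gguf": 3.07,
}

latencies = {model: ms_per_token(tps) for model, tps in llama_results.items()}

for model, ms in latencies.items():
    print(f"{model}: {ms:.1f} ms/token")
```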

OpenSSL

OpenSSL 3.1 (verify/s, More Is Better), G242-P36:
  Algorithm: RSA4096: 517886.0 (SE +/- 27.21, N = 3)
1. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

Neural Magic DeepSparse

Neural Magic DeepSparse 1.6, CPU Temperature Monitor (Celsius, Fewer Is Better), G242-P36, one entry per graph:
  Min 39.0 / Avg 61.3 / Max 76.0
  Min 49.0 / Avg 68.0 / Max 78.0
  Min 46.0 / Avg 69.6 / Max 83.0
  Min 50.0 / Avg 68.2 / Max 82.0
  Min 45.0 / Avg 68.9 / Max 81.0
  Min 50.0 / Avg 69.1 / Max 84.0
  Min 45.0 / Avg 66.4 / Max 81.0
  Min 46.0 / Avg 64.9 / Max 78.0
  Min 51.0 / Avg 65.7 / Max 75.0
  Min 47.0 / Avg 58.8 / Max 82.0
  Min 47.0 / Avg 62.7 / Max 75.0

Timed Linux Kernel Compilation

Timed Linux Kernel Compilation 6.1, CPU Temperature Monitor (Celsius, Fewer Is Better), G242-P36, one entry per graph:
  Min 48.0 / Avg 60.6 / Max 68.0
  Min 49.0 / Avg 65.4 / Max 70.0

Timed LLVM Compilation

Timed LLVM Compilation 16.0, CPU Temperature Monitor (Celsius, Fewer Is Better), G242-P36, one entry per graph:
  Min 47.0 / Avg 59.9 / Max 69.0
  Min 46.0 / Avg 57.4 / Max 68.0

Neural Magic DeepSparse

This is a benchmark of Neural Magic's DeepSparse using its built-in deepsparse.benchmark utility and various models from their SparseZoo (https://sparsezoo.neuralmagic.com/). Learn more via the OpenBenchmarking.org test page.

Neural Magic DeepSparse 1.6, Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better), G242-P36:
  Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8: 55.57 (SE +/- 0.09, N = 3)
  Model: NLP Text Classification, DistilBERT mnli: 185.36 (SE +/- 0.10, N = 3)
  Model: CV Classification, ResNet-50 ImageNet: 132.10 (SE +/- 0.15, N = 3)
  Model: NLP Token Classification, BERT base uncased conll2003: 1834.58 (SE +/- 1.29, N = 3)
  Model: CV Detection, YOLOv5s COCO: 314.15 (SE +/- 0.64, N = 3)
  Model: CV Detection, YOLOv5s COCO, Sparse INT8: 310.84 (SE +/- 0.86, N = 3)
  Model: NLP Document Classification, oBERT base uncased on IMDB: 1830.58 (SE +/- 0.45, N = 3)
  Model: CV Segmentation, 90% Pruned YOLACT Pruned: 1358.18 (SE +/- 8.37, N = 3)
  Model: ResNet-50, Baseline: 132.30 (SE +/- 0.18, N = 3)
  Model: ResNet-50, Sparse INT8: 23.50 (SE +/- 0.21, N = 3)
  Model: BERT-Large, NLP Question Answering: 1320.14 (SE +/- 0.85, N = 3)
  Model: BERT-Large, NLP Question Answering, Sparse INT8: 146.75 (SE +/- 1.58, N = 3)
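Several DeepSparse models above appear both as a dense variant and as a Sparse INT8 variant. A rough way to compare the two is the ratio of their ms/batch figures. The sketch below is illustrative arithmetic on the reported numbers only; the pairing of variants and the name `latency_ratio` are assumptions made for this example, not part of the benchmark output.

```python
def latency_ratio(dense_ms: float, sparse_ms: float) -> float:
    """Return how many times lower the sparse latency is than the dense one."""
    if dense_ms <= 0 or sparse_ms <= 0:
        raise ValueError("latencies must be positive")
    return dense_ms / sparse_ms


# (dense ms/batch, Sparse INT8 ms/batch) copied from the results above.
# The pairing of each dense model with its Sparse INT8 variant is an assumption.
pairs = {
    "ResNet-50": (132.30, 23.50),
    "BERT-Large, NLP Question Answering": (1320.14, 146.75),
    "CV Detection, YOLOv5s COCO": (314.15, 310.84),
}

ratios = {name: latency_ratio(dense, sparse) for name, (dense, sparse) in pairs.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

On these figures the Sparse INT8 variants of ResNet-50 and BERT-Large show much lower latency per batch, while YOLOv5s is nearly unchanged.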

Timed Linux Kernel Compilation

This test times how long it takes to build the Linux kernel in a default configuration (defconfig) for the architecture being tested, or alternatively in an allmodconfig configuration that builds all possible kernel modules. Learn more via the OpenBenchmarking.org test page.

Timed Linux Kernel Compilation 6.1 (Seconds, Fewer Is Better), G242-P36:
  Build: defconfig: 78.70 (SE +/- 0.82, N = 3)
  Build: allmodconfig: 308.30 (SE +/- 1.01, N = 3)

Timed LLVM Compilation

This test times how long it takes to compile/build the LLVM compiler stack. Learn more via the OpenBenchmarking.org test page.

Timed LLVM Compilation 16.0 (Seconds, Fewer Is Better), G242-P36:
  Build System: Ninja: 266.33 (SE +/- 0.67, N = 3)
  Build System: Unix Makefiles: 411.52 (SE +/- 1.15, N = 3)

164 Results Shown

CPU Temperature Monitor:
  Phoronix Test Suite System Monitoring:
    Celsius
    Megahertz
    Watts
    Watts
PyTorch:
  CPU - 1 - ResNet-50
  CPU - 1 - ResNet-152
  CPU - 1 - Efficientnet_v2_l
  CPU - 16 - ResNet-50
  CPU - 16 - ResNet-152
Stress-NG:
  CPU Stress
  Crypto
  Memory Copying
  Glibc Qsort Data Sorting
  Glibc C String Functions
  Vector Math
  Matrix Math
  Forking
  System V Message Passing
  Semaphores
  Socket Activity
  Context Switching
  Atomic
  CPU Cache
  Malloc
  MEMFD
  MMAP
  NUMA
  SENDFILE
  IO_uring
  Futex
  Mutex
  Function Call
  Poll
  Hash
  Pthread
  Zlib
  Floating Point
  Fused Multiply-Add
  Pipe
  Matrix 3D Math
  AVL Tree
  Vector Floating Point
  Vector Shuffle
  Wide Vector Math
  Cloning
  AVX-512 VNNI
  Mixed Scheduler
OpenSSL:
  SHA256
  SHA512
  AES-128-GCM
  AES-256-GCM
  ChaCha20
  ChaCha20-Poly1305
miniFE
Quicksilver:
  CORAL2 P1
  CORAL2 P2
  CTS2
Algebraic Multi-Grid Benchmark
ACES DGEMM
Xmrig:
  Monero - 1M
  Wownero - 1M
Neural Magic DeepSparse:
  NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream
  NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream
  CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream
  NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream
  CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream
  CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream
  NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream
  CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream
  ResNet-50, Baseline - Asynchronous Multi-Stream
  ResNet-50, Sparse INT8 - Asynchronous Multi-Stream
  BERT-Large, NLP Question Answering - Asynchronous Multi-Stream
  BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream
CacheBench:
  Read
  Write
  Read / Modify / Write
Neural Magic DeepSparse:
  CPU Peak Freq (Highest CPU Core Frequency) Monitor:
    Megahertz (35 entries)
7-Zip Compression:
  Compression Rating
  Decompression Rating
LeelaChessZero:
  BLAS
  Eigen
Stockfish
GROMACS
Speedb:
  Seq Fill
  Rand Fill
  Rand Fill Sync
  Rand Read
  Read While Writing
  Read Rand Write Rand
  Update Rand
RocksDB:
  Rand Read
  Read While Writing
  Read Rand Write Rand
  Update Rand
OpenSSL
Llama.cpp:
  llama-2-7b.Q4_0.gguf
  llama-2-13b.Q4_0.gguf
  llama-2-70b-chat.Q5_0.gguf
OpenSSL
Neural Magic DeepSparse:
  CPU Temp Monitor:
    Celsius (15 entries)
Neural Magic DeepSparse:
  NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream
  NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream
  CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream
  NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream
  CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream
  CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream
  NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream
  CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream
  ResNet-50, Baseline - Asynchronous Multi-Stream
  ResNet-50, Sparse INT8 - Asynchronous Multi-Stream
  BERT-Large, NLP Question Answering - Asynchronous Multi-Stream
  BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream
Timed Linux Kernel Compilation:
  defconfig
  allmodconfig
Timed LLVM Compilation:
  Ninja
  Unix Makefiles