Gigabyte G242-P36 Ampere Altra Max Server

Benchmarks by Michael Larabel.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2401172-NE-GIGABYTEG53

Result Identifier: G242-P36
Date Run: January 16 2024
Test Duration: 19 Hours, 11 Minutes


System Details

Processor: ARMv8 Neoverse-N1 @ 3.00GHz (128 Cores)
Motherboard: GIGABYTE G242-P36-00 MP32-AR2-00 v01000100 (F31k SCP
Chipset: Ampere Computing LLC Altra PCI Root Complex A
Memory: 16 x 32 GB DDR4-3200MT/s Samsung M393A4K40DB3-CWE
Disk: 800GB Micron_7450_MTFDKBA800TFS
Graphics: ASPEED
Monitor: VGA HDMI
Network: 2 x Intel I350
OS: Ubuntu 23.10
Kernel: 6.5.0-13-generic (aarch64)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 1920x1080

System Logs

- Transparent Huge Pages: madvise
- --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
- Scaling Governor: cppc_cpufreq performance (Boost: Disabled)
- Python 3.11.6
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected

Result Overview (G242-P36)

stress-ng: CPU Stress: 33761.08
stress-ng: Crypto: 252315.26
stress-ng: Memory Copying: 27153.74
stress-ng: Glibc Qsort Data Sorting: 2020.18
stress-ng: Glibc C String Functions: 62783286.48
stress-ng: Vector Math: 398869.87
stress-ng: Matrix Math: 681885.30
stress-ng: Forking: 52250.53
stress-ng: System V Message Passing: 21143237.72
stress-ng: Semaphores: 167637763.59
stress-ng: Socket Activity: 28009.07
stress-ng: Context Switching: 20365273.28
stress-ng: Atomic: 7.29
stress-ng: CPU Cache: 879814.35
stress-ng: Malloc: 164364343.39
stress-ng: MEMFD: 574.85
stress-ng: MMAP: 1088.77
stress-ng: NUMA: 1419.06
stress-ng: SENDFILE: 1624492.92
stress-ng: IO_uring: 604943.76
stress-ng: Futex: 343012.75
stress-ng: Mutex: 37172432.66
stress-ng: Function Call: 72283.18
stress-ng: Poll: 7330369.96
stress-ng: Hash: 15671801.48
stress-ng: Pthread: 113551.87
stress-ng: Zlib: 5987.88
stress-ng: Floating Point: 22213.54
stress-ng: Fused Multiply-Add: 151220570.51
stress-ng: Pipe: 30330081.18
stress-ng: Matrix 3D Math: 5099.81
stress-ng: AVL Tree: 299.50
stress-ng: Vector Floating Point: 102535.35
stress-ng: Vector Shuffle: 86218.95
stress-ng: Wide Vector Math: 2346519.63
stress-ng: Cloning: 7795.96
stress-ng: AVX-512 VNNI: 4690386.64
stress-ng: Mixed Scheduler: 36794.33
cachebench: Read: 11438.276516
cachebench: Write: 38239.970730
cachebench: Read / Modify / Write: 45034.976156
xmrig: Monero - 1M: 4201.7
xmrig: Wownero - 1M: 1935.2
lczero: BLAS: 62
lczero: Eigen: 48
deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream (items/sec): 1137.781
quicksilver: CORAL2 P1: 25273333
deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream (ms/batch): 55.5703
deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream (items/sec): 339.9765
quicksilver: CTS2: 16203333
deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream (ms/batch): 185.3571
deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream (items/sec): 477.8141
deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream (ms/batch): 132.1032
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream (items/sec): 33.6229
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream (ms/batch): 1834.5799
deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream (items/sec): 200.0280
deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream (ms/batch): 314.1452
quicksilver: CORAL2 P2: 25543333
deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream (items/sec): 202.2279
deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream (ms/batch): 310.8403
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream (items/sec): 33.7473
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream (ms/batch): 1830.5760
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream (items/sec): 45.6788
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream (ms/batch): 1358.1773
deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream (items/sec): 476.3781
deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream (ms/batch): 132.3047
deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream (items/sec): 2677.0708
deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream (ms/batch): 23.5004
deepsparse: BERT-Large, NLP Question Answering - Asynchronous Multi-Stream (items/sec): 47.0250
deepsparse: BERT-Large, NLP Question Answering - Asynchronous Multi-Stream (ms/batch): 1320.1354
deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream (items/sec): 430.1375
deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream (ms/batch): 146.7462
pytorch: CPU - 1 - ResNet-50: 1.91
pytorch: CPU - 1 - ResNet-152: 0.68
pytorch: CPU - 1 - Efficientnet_v2_l: 0.30
pytorch: CPU - 16 - ResNet-50: 1.83
pytorch: CPU - 16 - ResNet-152: 0.67
llama-cpp: llama-2-7b.Q4_0.gguf: 21.58
llama-cpp: llama-2-13b.Q4_0.gguf: 13.90
llama-cpp: llama-2-70b-chat.Q5_0.gguf: 3.07
gromacs: MPI CPU - water_GMX50_bare: 4.588
mt-dgemm: Sustained Floating-Point Rate: 17.784983
amg: 1057064333
minife: Small: 23996.0
stockfish: Total Time: 188653177
compress-7zip: Compression Rating: 333316
compress-7zip: Decompression Rating: 537647
build-llvm: Ninja: 266.333
build-llvm: Unix Makefiles: 411.521
build-linux-kernel: defconfig: 78.703
build-linux-kernel: allmodconfig: 308.297
speedb: Seq Fill: 295079
speedb: Rand Fill: 284987
speedb: Rand Fill Sync: 207376
speedb: Rand Read: 409571625
speedb: Read While Writing: 12905035
speedb: Read Rand Write Rand: 2419683
speedb: Update Rand: 272275
openssl: RSA4096 (sign/s): 6342.8
openssl: RSA4096 (verify/s): 517886.0
openssl: SHA256: 101322961753
openssl: SHA512: 34478769590
openssl: AES-128-GCM: 382688207300
openssl: AES-256-GCM: 306487842680
openssl: ChaCha20: 161732226070
openssl: ChaCha20-Poly1305: 112213448840
rocksdb: Rand Read: 434052355
rocksdb: Read While Writing: 8558845
rocksdb: Read Rand Write Rand: 3320337
rocksdb: Update Rand: 431406

Stress-NG

Stress-NG is a Linux stress tool developed by Colin Ian King. Learn more via the OpenBenchmarking.org test page.

Stress-NG 0.16.04 - Test: CPU Stress - G242-P36: 33761.08 Bogo Ops/s (More Is Better; SE +/- 1.60, N = 3)
Stress-NG 0.16.04 - Test: Crypto - G242-P36: 252315.26 Bogo Ops/s (More Is Better; SE +/- 928.63, N = 3)
Stress-NG 0.16.04 - Test: Memory Copying - G242-P36: 27153.74 Bogo Ops/s (More Is Better; SE +/- 1.16, N = 3)
Stress-NG 0.16.04 - Test: Glibc Qsort Data Sorting - G242-P36: 2020.18 Bogo Ops/s (More Is Better; SE +/- 0.78, N = 3)
Stress-NG 0.16.04 - Test: Glibc C String Functions - G242-P36: 62783286.48 Bogo Ops/s (More Is Better; SE +/- 17918.08, N = 3)
Stress-NG 0.16.04 - Test: Vector Math - G242-P36: 398869.87 Bogo Ops/s (More Is Better; SE +/- 4.53, N = 3)
Stress-NG 0.16.04 - Test: Matrix Math - G242-P36: 681885.30 Bogo Ops/s (More Is Better; SE +/- 404.39, N = 3)
Stress-NG 0.16.04 - Test: Forking - G242-P36: 52250.53 Bogo Ops/s (More Is Better; SE +/- 410.62, N = 3)
Stress-NG 0.16.04 - Test: System V Message Passing - G242-P36: 21143237.72 Bogo Ops/s (More Is Better; SE +/- 32907.24, N = 3)
Stress-NG 0.16.04 - Test: Semaphores - G242-P36: 167637763.59 Bogo Ops/s (More Is Better; SE +/- 217685.76, N = 3)
Stress-NG 0.16.04 - Test: Socket Activity - G242-P36: 28009.07 Bogo Ops/s (More Is Better; SE +/- 159.43, N = 3)
Stress-NG 0.16.04 - Test: Context Switching - G242-P36: 20365273.28 Bogo Ops/s (More Is Better; SE +/- 174052.70, N = 15)
Stress-NG 0.16.04 - Test: Atomic - G242-P36: 7.29 Bogo Ops/s (More Is Better; SE +/- 0.59, N = 15)
Stress-NG 0.16.04 - Test: CPU Cache - G242-P36: 879814.35 Bogo Ops/s (More Is Better; SE +/- 1033.74, N = 3)
Stress-NG 0.16.04 - Test: Malloc - G242-P36: 164364343.39 Bogo Ops/s (More Is Better; SE +/- 296218.44, N = 3)
Stress-NG 0.16.04 - Test: MEMFD - G242-P36: 574.85 Bogo Ops/s (More Is Better; SE +/- 4.82, N = 8)
Stress-NG 0.16.04 - Test: MMAP - G242-P36: 1088.77 Bogo Ops/s (More Is Better; SE +/- 5.43, N = 3)
Stress-NG 0.16.04 - Test: NUMA - G242-P36: 1419.06 Bogo Ops/s (More Is Better; SE +/- 2.47, N = 3)
Stress-NG 0.16.04 - Test: SENDFILE - G242-P36: 1624492.92 Bogo Ops/s (More Is Better; SE +/- 18.53, N = 3)
Stress-NG 0.16.04 - Test: IO_uring - G242-P36: 604943.76 Bogo Ops/s (More Is Better; SE +/- 5192.48, N = 3)
Stress-NG 0.16.04 - Test: Futex - G242-P36: 343012.75 Bogo Ops/s (More Is Better; SE +/- 7072.24, N = 15)
Stress-NG 0.16.04 - Test: Mutex - G242-P36: 37172432.66 Bogo Ops/s (More Is Better; SE +/- 9463.26, N = 3)
Stress-NG 0.16.04 - Test: Function Call - G242-P36: 72283.18 Bogo Ops/s (More Is Better; SE +/- 1.53, N = 3)
Stress-NG 0.16.04 - Test: Poll - G242-P36: 7330369.96 Bogo Ops/s (More Is Better; SE +/- 12697.25, N = 3)
Stress-NG 0.16.04 - Test: Hash - G242-P36: 15671801.48 Bogo Ops/s (More Is Better; SE +/- 9429.94, N = 3)
Stress-NG 0.16.04 - Test: Pthread - G242-P36: 113551.87 Bogo Ops/s (More Is Better; SE +/- 65.20, N = 3)
Stress-NG 0.16.04 - Test: Zlib - G242-P36: 5987.88 Bogo Ops/s (More Is Better; SE +/- 0.87, N = 3)
Stress-NG 0.16.04 - Test: Floating Point - G242-P36: 22213.54 Bogo Ops/s (More Is Better; SE +/- 0.42, N = 3)
Stress-NG 0.16.04 - Test: Fused Multiply-Add - G242-P36: 151220570.51 Bogo Ops/s (More Is Better; SE +/- 110268.18, N = 3)
Stress-NG 0.16.04 - Test: Pipe - G242-P36: 30330081.18 Bogo Ops/s (More Is Better; SE +/- 95784.06, N = 3)
Stress-NG 0.16.04 - Test: Matrix 3D Math - G242-P36: 5099.81 Bogo Ops/s (More Is Better; SE +/- 3.74, N = 3)
Stress-NG 0.16.04 - Test: AVL Tree - G242-P36: 299.50 Bogo Ops/s (More Is Better; SE +/- 0.16, N = 3)
Stress-NG 0.16.04 - Test: Vector Floating Point - G242-P36: 102535.35 Bogo Ops/s (More Is Better; SE +/- 25.89, N = 3)
Stress-NG 0.16.04 - Test: Vector Shuffle - G242-P36: 86218.95 Bogo Ops/s (More Is Better; SE +/- 3.20, N = 3)
Stress-NG 0.16.04 - Test: Wide Vector Math - G242-P36: 2346519.63 Bogo Ops/s (More Is Better; SE +/- 6960.54, N = 3)
Stress-NG 0.16.04 - Test: Cloning - G242-P36: 7795.96 Bogo Ops/s (More Is Better; SE +/- 29.21, N = 3)
Stress-NG 0.16.04 - Test: AVX-512 VNNI - G242-P36: 4690386.64 Bogo Ops/s (More Is Better; SE +/- 401.84, N = 3)
Stress-NG 0.16.04 - Test: Mixed Scheduler - G242-P36: 36794.33 Bogo Ops/s (More Is Better; SE +/- 141.59, N = 3)
All Stress-NG tests: (CXX) g++ options: -O2 -std=gnu99 -lc
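
The "SE +/-" figures in the Stress-NG results are easier to compare once they are scaled by the reported value. The sketch below assumes SE denotes the standard error of the mean over the N runs; the helper name `relative_se_percent` is hypothetical and not part of the Phoronix Test Suite. The numbers are copied from four of the results above.

```python
# (reported value, standard error) pairs copied from the Stress-NG results.
reported = {
    "CPU Stress": (33761.08, 1.60),
    "Context Switching": (20365273.28, 174052.70),
    "Atomic": (7.29, 0.59),
    "Futex": (343012.75, 7072.24),
}


def relative_se_percent(value, se):
    """Return the standard error as a percentage of the reported value."""
    return 100.0 * se / value


relative = {name: relative_se_percent(v, se) for name, (v, se) in reported.items()}

# The test with the largest relative error is the least repeatable of the four.
noisiest = max(relative, key=relative.get)

for name, pct in sorted(relative.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {pct:.2f}% relative SE")
print("Least repeatable:", noisiest)
```

Read this way, the Atomic result carries a much larger run-to-run spread than CPU Stress, so small differences in that test between systems deserve more caution.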

CacheBench

This is a performance test of CacheBench, which is part of LLCbench. CacheBench is designed to test memory and cache bandwidth performance. Learn more via the OpenBenchmarking.org test page.

CacheBench - Test: Read - G242-P36: 11438.28 MB/s (More Is Better; SE +/- 0.01, N = 3; MIN: 11437.32 / MAX: 11438.59)
CacheBench - Test: Write - G242-P36: 38239.97 MB/s (More Is Better; SE +/- 1.22, N = 3; MIN: 35288.52 / MAX: 41382)
CacheBench - Test: Read / Modify / Write - G242-P36: 45034.98 MB/s (More Is Better; SE +/- 2.04, N = 3; MIN: 43692.22 / MAX: 45639.26)
All CacheBench tests: (CC) gcc options: -O3 -lrt

Xmrig

Xmrig is an open-source cross-platform CPU/GPU miner for RandomX, KawPow, CryptoNight and AstroBWT. This test profile is set up to measure Xmrig CPU mining performance. Learn more via the OpenBenchmarking.org test page.

Xmrig 6.21 - Variant: Monero - Hash Count: 1M - G242-P36: 4201.7 H/s (More Is Better; SE +/- 17.55, N = 3)
Xmrig 6.21 - Variant: Wownero - Hash Count: 1M - G242-P36: 1935.2 H/s (More Is Better; SE +/- 2.92, N = 3)
All Xmrig tests: (CXX) g++ options: -fexceptions -fno-rtti -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

LeelaChessZero

LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.

LeelaChessZero 0.30 - Backend: BLAS - G242-P36: 62 Nodes Per Second (More Is Better; SE +/- 0.58, N = 3)
LeelaChessZero 0.30 - Backend: Eigen - G242-P36: 48 Nodes Per Second (More Is Better; SE +/- 0.33, N = 3)
All LeelaChessZero tests: (CXX) g++ options: -flto -pthread

Neural Magic DeepSparse

This is a benchmark of Neural Magic's DeepSparse using its built-in deepsparse.benchmark utility and various models from their SparseZoo (https://sparsezoo.neuralmagic.com/). Learn more via the OpenBenchmarking.org test page.

Neural Magic DeepSparse 1.6 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream - G242-P36: 1137.78 items/sec (More Is Better; SE +/- 1.48, N = 3)

Quicksilver

Quicksilver is a proxy application that represents some elements of the Mercury workload by solving a simplified dynamic Monte Carlo particle transport problem. Quicksilver is developed by Lawrence Livermore National Laboratory (LLNL) and this test profile currently makes use of the OpenMP CPU threaded code path. Learn more via the OpenBenchmarking.org test page.

Quicksilver 20230818 - Input: CORAL2 P1 - G242-P36: 25273333 Figure Of Merit (More Is Better; SE +/- 81103.50, N = 3)
(CXX) g++ options: -fopenmp -O3 -march=native

Neural Magic DeepSparse

Neural Magic DeepSparse 1.6 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream - G242-P36: 55.57 ms/batch (Fewer Is Better; SE +/- 0.09, N = 3)
Neural Magic DeepSparse 1.6 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream - G242-P36: 339.98 items/sec (More Is Better; SE +/- 0.19, N = 3)

Quicksilver

Quicksilver 20230818 - Input: CTS2 - G242-P36: 16203333 Figure Of Merit (More Is Better; SE +/- 42557.15, N = 3)
(CXX) g++ options: -fopenmp -O3 -march=native

Neural Magic DeepSparse

Neural Magic DeepSparse 1.6 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream - G242-P36: 185.36 ms/batch (Fewer Is Better; SE +/- 0.10, N = 3)
Neural Magic DeepSparse 1.6 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream - G242-P36: 477.81 items/sec (More Is Better; SE +/- 0.36, N = 3)
Neural Magic DeepSparse 1.6 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream - G242-P36: 132.10 ms/batch (Fewer Is Better; SE +/- 0.15, N = 3)
Neural Magic DeepSparse 1.6 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream - G242-P36: 33.62 items/sec (More Is Better; SE +/- 0.04, N = 3)
Neural Magic DeepSparse 1.6 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream - G242-P36: 1834.58 ms/batch (Fewer Is Better; SE +/- 1.29, N = 3)
Neural Magic DeepSparse 1.6 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream - G242-P36: 200.03 items/sec (More Is Better; SE +/- 0.52, N = 3)
Neural Magic DeepSparse 1.6 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream - G242-P36: 314.15 ms/batch (Fewer Is Better; SE +/- 0.64, N = 3)

Quicksilver

Quicksilver 20230818 - Input: CORAL2 P2 - G242-P36: 25543333 Figure Of Merit (More Is Better; SE +/- 84129.53, N = 3)
(CXX) g++ options: -fopenmp -O3 -march=native

Neural Magic DeepSparse

Neural Magic DeepSparse 1.6 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream - G242-P36: 202.23 items/sec (More Is Better; SE +/- 0.53, N = 3)
Neural Magic DeepSparse 1.6 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream - G242-P36: 310.84 ms/batch (Fewer Is Better; SE +/- 0.86, N = 3)
Neural Magic DeepSparse 1.6 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream - G242-P36: 33.75 items/sec (More Is Better; SE +/- 0.08, N = 3)
Neural Magic DeepSparse 1.6 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream - G242-P36: 1830.58 ms/batch (Fewer Is Better; SE +/- 0.45, N = 3)
Neural Magic DeepSparse 1.6 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream - G242-P36: 45.68 items/sec (More Is Better; SE +/- 0.32, N = 3)
Neural Magic DeepSparse 1.6 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream - G242-P36: 1358.18 ms/batch (Fewer Is Better; SE +/- 8.37, N = 3)
Neural Magic DeepSparse 1.6 - Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream - G242-P36: 476.38 items/sec (More Is Better; SE +/- 0.42, N = 3)
Neural Magic DeepSparse 1.6 - Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream - G242-P36: 132.30 ms/batch (Fewer Is Better; SE +/- 0.18, N = 3)
Neural Magic DeepSparse 1.6 - Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream - G242-P36: 2677.07 items/sec (More Is Better; SE +/- 25.24, N = 3)
Neural Magic DeepSparse 1.6 - Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream - G242-P36: 23.50 ms/batch (Fewer Is Better; SE +/- 0.21, N = 3)
Neural Magic DeepSparse 1.6 - Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream - G242-P36: 47.03 items/sec (More Is Better; SE +/- 0.03, N = 3)
Neural Magic DeepSparse 1.6 - Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream - G242-P36: 1320.14 ms/batch (Fewer Is Better; SE +/- 0.85, N = 3)
Neural Magic DeepSparse 1.6 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream - G242-P36: 430.14 items/sec (More Is Better; SE +/- 4.70, N = 3)
Neural Magic DeepSparse 1.6 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream - G242-P36: 146.75 ms/batch (Fewer Is Better; SE +/- 1.58, N = 3)
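
Each DeepSparse model is reported twice, once as throughput (items/sec) and once as latency (ms/batch). As a rough cross-check, Little's law (in-flight work = throughput x latency) relates the two. The sketch below assumes a batch size of 1, so that one batch equals one item; the result page does not state the batch size, and the helper name `implied_in_flight` is hypothetical. The figures are copied from the results above.

```python
# (items/sec, ms/batch) pairs copied from the DeepSparse results.
pairs = {
    "BERT base uncased SST2, Sparse INT8": (1137.78, 55.57),
    "DistilBERT mnli": (339.98, 185.36),
    "ResNet-50, Sparse INT8": (2677.07, 23.50),
    "BERT-Large, NLP Question Answering": (47.03, 1320.14),
}


def implied_in_flight(items_per_sec, ms_per_batch):
    """Little's law: requests in flight = throughput * latency (in seconds).

    Assumes one item per batch, which the result page does not confirm.
    """
    return items_per_sec * ms_per_batch / 1000.0


in_flight = {model: implied_in_flight(t, l) for model, (t, l) in pairs.items()}

for model, n in in_flight.items():
    print(f"{model}: ~{n:.1f} requests in flight")
```

Under that assumption the four models all land at roughly 62 to 63 concurrent requests, which suggests the throughput and latency figures describe the same multi-stream load rather than independent measurements.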

PyTorch

This is a benchmark of PyTorch making use of pytorch-benchmark [https://github.com/LukasHedegaard/pytorch-benchmark]. Currently this test profile is catered to CPU-based testing. Learn more via the OpenBenchmarking.org test page.

PyTorch 2.1 - Device: CPU - Batch Size: 1 - Model: ResNet-50 - G242-P36: 1.91 batches/sec (More Is Better; SE +/- 0.00, N = 3; MIN: 1.8 / MAX: 2.09)
PyTorch 2.1 - Device: CPU - Batch Size: 1 - Model: ResNet-152 - G242-P36: 0.68 batches/sec (More Is Better; SE +/- 0.00, N = 3; MIN: 0.65 / MAX: 0.7)
PyTorch 2.1 - Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l - G242-P36: 0.30 batches/sec (More Is Better; SE +/- 0.00, N = 3; MIN: 0.27 / MAX: 0.4)
PyTorch 2.1 - Device: CPU - Batch Size: 16 - Model: ResNet-50 - G242-P36: 1.83 batches/sec (More Is Better; SE +/- 0.02, N = 5; MIN: 1.7 / MAX: 2.02)
PyTorch 2.1 - Device: CPU - Batch Size: 16 - Model: ResNet-152 - G242-P36: 0.67 batches/sec (More Is Better; SE +/- 0.00, N = 2; MIN: 0.65 / MAX: 0.7)

Llama.cpp

Llama.cpp is a port of Facebook's LLaMA model in C/C++ developed by Georgi Gerganov. Llama.cpp allows the inference of LLaMA and other supported models in C/C++. For CPU inference Llama.cpp supports AVX2/AVX-512, ARM NEON, and other modern ISAs along with features like OpenBLAS usage. Learn more via the OpenBenchmarking.org test page.

Llama.cpp b1808 - Model: llama-2-7b.Q4_0.gguf - G242-P36: 21.58 Tokens Per Second (More Is Better; SE +/- 0.21, N = 6)
Llama.cpp b1808 - Model: llama-2-13b.Q4_0.gguf - G242-P36: 13.90 Tokens Per Second (More Is Better; SE +/- 0.16, N = 15)
Llama.cpp b1808 - Model: llama-2-70b-chat.Q5_0.gguf - G242-P36: 3.07 Tokens Per Second (More Is Better; SE +/- 0.03, N = 8)
All Llama.cpp tests: (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -lopenblas
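
Tokens per second can be easier to judge as time per token. The short sketch below converts the three Llama.cpp results into milliseconds per generated token; the variable names are illustrative only, and the conversion assumes the reported rate is a steady average over the run.

```python
# Tokens-per-second results copied from the Llama.cpp section.
tokens_per_second = {
    "llama-2-7b.Q4_0.gguf": 21.58,
    "llama-2-13b.Q4_0.gguf": 13.90,
    "llama-2-70b-chat.Q5_0.gguf": 3.07,
}

# Invert the rate: 1000 ms divided by tokens/sec gives ms per token.
ms_per_token = {model: 1000.0 / tps for model, tps in tokens_per_second.items()}

for model, ms in ms_per_token.items():
    print(f"{model}: {ms:.1f} ms per token")
```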

GROMACS

The GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package, tested with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

GROMACS 2023 - Implementation: MPI CPU - Input: water_GMX50_bare - G242-P36: 4.588 Ns Per Day (More Is Better; SE +/- 0.002, N = 3)
(CXX) g++ options: -O3

ACES DGEMM

This is a multi-threaded DGEMM benchmark. Learn more via the OpenBenchmarking.org test page.

ACES DGEMM 1.0 - Sustained Floating-Point Rate - G242-P36: 17.78 GFLOP/s (More Is Better; SE +/- 0.09, N = 4)
(CC) gcc options: -O3 -march=native -fopenmp

Algebraic Multi-Grid Benchmark

AMG is a parallel algebraic multigrid solver for linear systems arising from problems on unstructured grids. The driver provided with AMG builds linear systems for various 3-dimensional problems. Learn more via the OpenBenchmarking.org test page.

Algebraic Multi-Grid Benchmark 1.2 - G242-P36: 1057064333 Figure Of Merit (More Is Better; SE +/- 47484.50, N = 3)
(CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -lmpi

miniFE

MiniFE Finite Element is an application for unstructured implicit finite element codes. Learn more via the OpenBenchmarking.org test page.

miniFE 2.2 - Problem Size: Small - G242-P36: 23996.0 CG Mflops (More Is Better; SE +/- 14.30, N = 4)
(CXX) g++ options: -O3 -fopenmp -lmpi_cxx -lmpi

Stockfish

This is a test of Stockfish, an advanced open-source C++11 chess benchmark that can scale up to 512 CPU threads. Learn more via the OpenBenchmarking.org test page.

Stockfish 15 - Total Time - G242-P36: 188653177 Nodes Per Second (More Is Better; SE +/- 6857171.33, N = 15)
(CXX) g++ options: -lgcov -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -flto -flto=jobserver

7-Zip Compression

This is a test of 7-Zip compression/decompression with its integrated benchmark feature. Learn more via the OpenBenchmarking.org test page.

7-Zip Compression 22.01 - Test: Compression Rating - G242-P36: 333316 MIPS (More Is Better; SE +/- 991.66, N = 3)
7-Zip Compression 22.01 - Test: Decompression Rating - G242-P36: 537647 MIPS (More Is Better; SE +/- 396.38, N = 3)
All 7-Zip tests: (CXX) g++ options: -lpthread -ldl -O2 -fPIC

Timed LLVM Compilation

This test times how long it takes to compile/build the LLVM compiler stack. Learn more via the OpenBenchmarking.org test page.

Timed LLVM Compilation 16.0 - Build System: Ninja - G242-P36: 266.33 Seconds (Fewer Is Better; SE +/- 0.67, N = 3)
Timed LLVM Compilation 16.0 - Build System: Unix Makefiles - G242-P36: 411.52 Seconds (Fewer Is Better; SE +/- 1.15, N = 3)

Timed Linux Kernel Compilation

This test times how long it takes to build the Linux kernel in a default configuration (defconfig) for the architecture being tested or alternatively an allmodconfig for building all possible kernel modules for the build. Learn more via the OpenBenchmarking.org test page.

Timed Linux Kernel Compilation 6.1 - Build: defconfig - G242-P36: 78.70 Seconds (Fewer Is Better; SE +/- 0.82, N = 3)
Timed Linux Kernel Compilation 6.1 - Build: allmodconfig - G242-P36: 308.30 Seconds (Fewer Is Better; SE +/- 1.01, N = 3)

Speedb

Speedb is a next-generation key-value storage engine that is RocksDB compatible and aims for stability, efficiency, and performance. Learn more via the OpenBenchmarking.org test page.

Speedb 2.7 - Test: Sequential Fill - G242-P36: 295079 Op/s (More Is Better; SE +/- 3101.60, N = 5)
Speedb 2.7 - Test: Random Fill - G242-P36: 284987 Op/s (More Is Better; SE +/- 1985.22, N = 3)
Speedb 2.7 - Test: Random Fill Sync - G242-P36: 207376 Op/s (More Is Better; SE +/- 1986.97, N = 3)
Speedb 2.7 - Test: Random Read - G242-P36: 409571625 Op/s (More Is Better; SE +/- 2947408.87, N = 11)
Speedb 2.7 - Test: Read While Writing - G242-P36: 12905035 Op/s (More Is Better; SE +/- 201662.23, N = 15)
Speedb 2.7 - Test: Read Random Write Random - G242-P36: 2419683 Op/s (More Is Better; SE +/- 21596.32, N = 3)
Speedb 2.7 - Test: Update Random - G242-P36: 272275 Op/s (More Is Better; SE +/- 1573.56, N = 3)
All Speedb tests: (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.

OpenSSL 3.1, Algorithm: RSA4096 (sign/s, More Is Better)
G242-P36: 6342.8 (SE +/- 0.10, N = 3)

OpenSSL 3.1, Algorithm: RSA4096 (verify/s, More Is Better)
G242-P36: 517886.0 (SE +/- 27.21, N = 3)

OpenSSL 3.1, Algorithm: SHA256 (byte/s, More Is Better)
G242-P36: 101322961753 (SE +/- 64411674.99, N = 3)

OpenSSL 3.1, Algorithm: SHA512 (byte/s, More Is Better)
G242-P36: 34478769590 (SE +/- 8688088.34, N = 3)

OpenSSL 3.1, Algorithm: AES-128-GCM (byte/s, More Is Better)
G242-P36: 382688207300 (SE +/- 3586455.40, N = 3)

OpenSSL 3.1, Algorithm: AES-256-GCM (byte/s, More Is Better)
G242-P36: 306487842680 (SE +/- 40660594.45, N = 3)

OpenSSL 3.1, Algorithm: ChaCha20 (byte/s, More Is Better)
G242-P36: 161732226070 (SE +/- 10001054.79, N = 3)

OpenSSL 3.1, Algorithm: ChaCha20-Poly1305 (byte/s, More Is Better)
G242-P36: 112213448840 (SE +/- 361309.16, N = 3)

1. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl
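The byte/s figures above are aggregate throughput for the whole system. As a rough aid to reading them, the sketch below converts each to GB/s and divides by the 128 cores listed in the system table. It assumes the benchmark used all 128 cores and that each contributed equally, which the result file does not confirm; the function name is hypothetical.

```python
CORES = 128  # core count from the system table

# Aggregate byte/s values as read from the OpenSSL 3.1 graphs above
openssl_bytes_per_sec = {
    "SHA256": 101_322_961_753,
    "SHA512": 34_478_769_590,
    "AES-128-GCM": 382_688_207_300,
    "AES-256-GCM": 306_487_842_680,
    "ChaCha20": 161_732_226_070,
    "ChaCha20-Poly1305": 112_213_448_840,
}


def per_core_gb_per_sec(total_bytes_per_sec, cores=CORES):
    """Average per-core throughput in GB/s (1 GB = 1e9 bytes), assuming even scaling."""
    return total_bytes_per_sec / cores / 1e9


for name, total in openssl_bytes_per_sec.items():
    print(f"{name:18s} {total / 1e9:7.1f} GB/s total  "
          f"{per_core_gb_per_sec(total):5.2f} GB/s per core")
```

Under these assumptions AES-128-GCM works out to roughly 3 GB/s per core.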

RocksDB

This is a benchmark of Meta/Facebook's RocksDB, an embeddable persistent key-value store for fast storage that is based on Google's LevelDB. Learn more via the OpenBenchmarking.org test page.

RocksDB 8.0, Test: Random Read (Op/s, More Is Better)
G242-P36: 434052355 (SE +/- 4162622.50, N = 15)

RocksDB 8.0, Test: Read While Writing (Op/s, More Is Better)
G242-P36: 8558845 (SE +/- 68677.29, N = 9)

RocksDB 8.0, Test: Read Random Write Random (Op/s, More Is Better)
G242-P36: 3320337 (SE +/- 30568.75, N = 7)

RocksDB 8.0, Test: Update Random (Op/s, More Is Better)
G242-P36: 431406 (SE +/- 4409.44, N = 3)

1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
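Four test names appear in both the Speedb 2.7 and RocksDB 8.0 results on this page. The sketch below computes the Speedb-to-RocksDB ratio for each, using the Op/s values read from the graphs. It assumes the two test profiles run comparable workloads under the same test name, which this result file does not establish, so the ratios are indicative only.

```python
# Op/s values as read from the Speedb 2.7 and RocksDB 8.0 graphs on this page
speedb_ops = {
    "Random Read": 409_571_625,
    "Read While Writing": 12_905_035,
    "Read Random Write Random": 2_419_683,
    "Update Random": 272_275,
}
rocksdb_ops = {
    "Random Read": 434_052_355,
    "Read While Writing": 8_558_845,
    "Read Random Write Random": 3_320_337,
    "Update Random": 431_406,
}


def speedb_to_rocksdb_ratio(test):
    """Ratio above 1.0 means Speedb reported more Op/s than RocksDB for that test."""
    return speedb_ops[test] / rocksdb_ops[test]


for test in speedb_ops:
    print(f"{test:26s} {speedb_to_rocksdb_ratio(test):.2f}x")
```

On this system only Read While Writing comes out above 1.0; the other three tests favor RocksDB 8.0.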

CPU Peak Freq (Highest CPU Core Frequency) Monitor

Phoronix Test Suite System Monitoring (Megahertz)
G242-P36: 3000

CPU Power Consumption Monitor

Phoronix Test Suite System Monitoring (Watts)
G242-P36: Min: 21.48 / Avg: 119.26 / Max: 284.79

CPU Temperature Monitor

Phoronix Test Suite System Monitoring (Celsius)
G242-P36: Min: 35 / Avg: 61.54 / Max: 84

System Power Consumption Monitor

Phoronix Test Suite System Monitoring (Watts)
G242-P36: Min: 121 / Avg: 261.83 / Max: 673
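The monitor averages can be combined with the run duration listed at the top of the page (19 Hours, 11 Minutes) for a rough energy estimate. The sketch below assumes the power averages cover that entire duration, which the result file does not state explicitly, so treat the output as a ballpark figure.

```python
# Values taken from this page: monitor averages and the listed test duration
AVG_SYSTEM_WATTS = 261.83
AVG_CPU_WATTS = 119.26
RUN_HOURS = 19 + 11 / 60  # 19 Hours, 11 Minutes


def energy_kwh(avg_watts, hours):
    """Energy in kWh for a constant average power draw over the given hours."""
    return avg_watts * hours / 1000.0


def cpu_share(avg_cpu_watts, avg_system_watts):
    """Fraction of average system power attributed to the CPU monitor reading."""
    return avg_cpu_watts / avg_system_watts


print(f"Estimated system energy: {energy_kwh(AVG_SYSTEM_WATTS, RUN_HOURS):.2f} kWh")
print(f"CPU share of average power: {cpu_share(AVG_CPU_WATTS, AVG_SYSTEM_WATTS):.1%}")
```

Under these assumptions the full run consumed about 5 kWh at the wall, with the CPU accounting for a little under half of the average draw.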

114 Results Shown

Stress-NG:
  CPU Stress
  Crypto
  Memory Copying
  Glibc Qsort Data Sorting
  Glibc C String Functions
  Vector Math
  Matrix Math
  Forking
  System V Message Passing
  Semaphores
  Socket Activity
  Context Switching
  Atomic
  CPU Cache
  Malloc
  MEMFD
  MMAP
  NUMA
  SENDFILE
  IO_uring
  Futex
  Mutex
  Function Call
  Poll
  Hash
  Pthread
  Zlib
  Floating Point
  Fused Multiply-Add
  Pipe
  Matrix 3D Math
  AVL Tree
  Vector Floating Point
  Vector Shuffle
  Wide Vector Math
  Cloning
  AVX-512 VNNI
  Mixed Scheduler
CacheBench:
  Read
  Write
  Read / Modify / Write
Xmrig:
  Monero - 1M
  Wownero - 1M
LeelaChessZero:
  BLAS
  Eigen
Neural Magic DeepSparse
Quicksilver
Neural Magic DeepSparse:
  NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream
  NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream
Quicksilver
Neural Magic DeepSparse:
  NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream
  CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream
  CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream
  NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream
  NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream
  CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream
  CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream
Quicksilver
Neural Magic DeepSparse:
  CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  ResNet-50, Baseline - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  ResNet-50, Sparse INT8 - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  BERT-Large, NLP Question Answering - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream:
    items/sec
    ms/batch
PyTorch:
  CPU - 1 - ResNet-50
  CPU - 1 - ResNet-152
  CPU - 1 - Efficientnet_v2_l
  CPU - 16 - ResNet-50
  CPU - 16 - ResNet-152
Llama.cpp:
  llama-2-7b.Q4_0.gguf
  llama-2-13b.Q4_0.gguf
  llama-2-70b-chat.Q5_0.gguf
GROMACS
ACES DGEMM
Algebraic Multi-Grid Benchmark
miniFE
Stockfish
7-Zip Compression:
  Compression Rating
  Decompression Rating
Timed LLVM Compilation:
  Ninja
  Unix Makefiles
Timed Linux Kernel Compilation:
  defconfig
  allmodconfig
Speedb:
  Seq Fill
  Rand Fill
  Rand Fill Sync
  Rand Read
  Read While Writing
  Read Rand Write Rand
  Update Rand
OpenSSL:
  RSA4096:
    sign/s
    verify/s
  SHA256:
    byte/s
  SHA512:
    byte/s
  AES-128-GCM:
    byte/s
  AES-256-GCM:
    byte/s
  ChaCha20:
    byte/s
  ChaCha20-Poly1305:
    byte/s
RocksDB:
  Rand Read
  Read While Writing
  Read Rand Write Rand
  Update Rand
Phoronix Test Suite System Monitoring:
  CPU Peak Freq (Highest CPU Core Frequency) Monitor:
    Megahertz
  CPU Power Consumption Monitor:
    Watts
  CPU Temperature Monitor:
    Celsius
  System Power Consumption Monitor:
    Watts