Benchmarks by Michael Larabel for a future article.
G242-P36
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
Processor Notes: Scaling Governor: cppc_cpufreq performance (Boost: Disabled)
Python Notes: Python 3.11.6
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
gig dd Processor: ARMv8 Neoverse-N1 @ 3.00GHz (128 Cores), Motherboard: GIGABYTE G242-P36-00 MP32-AR2-00 v01000100 (F31k SCP: 2.10.20220531 BIOS), Chipset: Ampere Computing LLC Altra PCI Root Complex A, Memory: 16 x 32 GB DDR4-3200MT/s Samsung M393A4K40DB3-CWE, Disk: 800GB Micron_7450_MTFDKBA800TFS, Graphics: ASPEED, Monitor: VGA HDMI, Network: 2 x Intel I350
OS: Ubuntu 23.10, Kernel: 6.5.0-13-generic (aarch64), Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1920x1080
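The kernel and processor notes in this report (Transparent Huge Pages: madvise, Scaling Governor: performance) correspond to settings that Linux exposes through sysfs. The following is a minimal sketch of how such values could be read on a comparable system. The two paths are the standard Linux locations, but the helper names `active_option` and `read_setting` are illustrative, and this is not necessarily how the Phoronix Test Suite collects the values.

```python
from pathlib import Path

# Standard Linux sysfs locations; they may be absent on other kernels or OSes.
THP_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")
GOVERNOR_PATH = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor")


def active_option(sysfs_text: str) -> str:
    """Return the bracketed choice from text such as 'always [madvise] never'."""
    for token in sysfs_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    # Files like scaling_governor hold a single bare value.
    return sysfs_text.strip()


def read_setting(path: Path) -> str:
    """Read one sysfs setting, or report 'unavailable' if it cannot be read."""
    try:
        return active_option(path.read_text())
    except OSError:
        return "unavailable"


if __name__ == "__main__":
    print("Transparent Huge Pages:", read_setting(THP_PATH))
    print("Scaling Governor:", read_setting(GOVERNOR_PATH))
```

On a machine configured like the one tested here, the first line would be expected to report madvise and the second performance; on systems without these files the sketch prints unavailable instead of failing.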
Gigabyte G242-P36 Ampere Altra Max Server
Result Overview (Phoronix Test Suite). Runs: G242-P36, gig, dd. Relative scale: 100% to 121%. Tests shown: Stockfish, Llama.cpp, LeelaChessZero, Quicksilver, RocksDB, Timed Linux Kernel Compilation, Stress-NG, Timed LLVM Compilation, Speedb, Neural Magic DeepSparse, 7-Zip Compression, OpenSSL, CacheBench.
Gigabyte G242-P36 Ampere Altra Max Server pytorch: CPU - 1 - ResNet-50 pytorch: CPU - 1 - ResNet-152 pytorch: CPU - 1 - Efficientnet_v2_l pytorch: CPU - 16 - ResNet-50 pytorch: CPU - 16 - ResNet-152 stress-ng: CPU Stress stress-ng: Crypto stress-ng: Memory Copying stress-ng: Glibc Qsort Data Sorting stress-ng: Glibc C String Functions stress-ng: Vector Math stress-ng: Matrix Math stress-ng: Forking stress-ng: System V Message Passing stress-ng: Semaphores stress-ng: Socket Activity stress-ng: Context Switching stress-ng: Atomic stress-ng: CPU Cache stress-ng: Malloc stress-ng: MEMFD stress-ng: MMAP stress-ng: NUMA stress-ng: SENDFILE stress-ng: IO_uring stress-ng: Futex stress-ng: Mutex stress-ng: Function Call stress-ng: Poll stress-ng: Hash stress-ng: Pthread stress-ng: Zlib stress-ng: Floating Point stress-ng: Fused Multiply-Add stress-ng: Pipe stress-ng: Matrix 3D Math stress-ng: AVL Tree stress-ng: Vector Floating Point stress-ng: Vector Shuffle stress-ng: Wide Vector Math stress-ng: Cloning stress-ng: AVX-512 VNNI stress-ng: Mixed Scheduler openssl: SHA256 openssl: SHA512 openssl: AES-128-GCM openssl: AES-256-GCM openssl: ChaCha20 openssl: ChaCha20-Poly1305 minife: Small quicksilver: CORAL2 P1 quicksilver: CORAL2 P2 quicksilver: CTS2 amg: mt-dgemm: Sustained Floating-Point Rate xmrig: Monero - 1M xmrig: Wownero - 1M deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - 
Asynchronous Multi-Stream deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream deepsparse: BERT-Large, NLP Question Answering - Asynchronous Multi-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream cachebench: Read cachebench: Write cachebench: Read / Modify / Write compress-7zip: Compression Rating compress-7zip: Decompression Rating lczero: BLAS lczero: Eigen stockfish: Total Time gromacs: MPI CPU - water_GMX50_bare speedb: Seq Fill speedb: Rand Fill speedb: Rand Fill Sync speedb: Rand Read speedb: Read While Writing speedb: Read Rand Write Rand speedb: Update Rand rocksdb: Rand Read rocksdb: Read While Writing rocksdb: Read Rand Write Rand rocksdb: Update Rand openssl: RSA4096 llama-cpp: llama-2-7b.Q4_0.gguf llama-cpp: llama-2-13b.Q4_0.gguf llama-cpp: llama-2-70b-chat.Q5_0.gguf openssl: RSA4096 deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream deepsparse: BERT-Large, NLP Question Answering - Asynchronous Multi-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream build-linux-kernel: defconfig build-linux-kernel: allmodconfig build-llvm: Ninja build-llvm: Unix Makefiles 
G242-P36 gig dd 1.91 0.68 0.30 1.83 0.67 33761.08 252315.26 27153.74 2020.18 62783286.48 398869.87 681885.30 52250.53 21143237.72 167637763.59 28009.07 20365273.28 7.29 879814.35 164364343.39 574.85 1088.77 1419.06 1624492.92 604943.76 343012.75 37172432.66 72283.18 7330369.96 15671801.48 113551.87 5987.88 22213.54 151220570.51 30330081.18 5099.81 299.50 102535.35 86218.95 2346519.63 7795.96 4690386.64 36794.33 101322961753 34478769590 382688207300 306487842680 161732226070 112213448840 23996.0 25273333 25543333 16203333 1057064333 17.784983 4201.7 1935.2 1137.781 339.9765 477.8141 33.6229 200.0280 202.2279 33.7473 45.6788 476.3781 2677.0708 47.0250 430.1375 11438.276516 38239.970730 45034.976156 333316 537647 62 48 188653177 4.588 295079 284987 207376 409571625 12905035 2419683 272275 434052355 8558845 3320337 431406 6342.8 21.58 13.90 3.07 517886.0 55.5703 185.3571 132.1032 1834.5799 314.1452 310.8403 1830.5760 1358.1773 132.3047 23.5004 1320.1354 146.7462 78.703 308.297 266.333 411.521 33765.26 251986.12 27162.14 2022.01 62867317.16 398993.46 682490.75 50130.97 21054213.79 167850957.68 27959.85 19654874.85 5.64 882510.28 164067515.18 576.53 1104.19 1416.03 1624969.46 612149.93 323012.96 37215286.04 72298.23 7392099.82 15654462.92 112993.15 5993.74 22219.8 151387869.76 29805509.12 5082.65 299.1 102553.11 86375.79 2355564.94 7312.78 4691697.85 36309.29 100039593750 34453399030 382856328260 306544534870 161791663040 112250396400 24150.7 25810000 25520000 16460000 1060136000 18.27275 1141.451 339.5239 472.0699 33.5823 198.9064 202.6332 33.869 46.7273 477.0964 2624.7719 46.5531 421.345 11438.666161 38251.591924 45027.472701 333057 541204 59 47 177653916 4.688 290059 278264 204410 418448304 13255341 2518519 264998 450500912 8516060 3449038 427908 6345.6 21.9 14.02 3.13 518115.9 55.4233 185.8675 133.4484 1832.1154 315.8962 310.6422 1830.7174 1334.5433 132.3228 24.0125 1326.9581 149.4774 80.078 309.477 267.86 408.271 33559.87 251996.36 27159.07 2020.3 62845443.53 
399042.09 682554.33 50686.58 21119614.31 166379337.67 27536.79 20708288.98 6.8 882225.34 164592319.96 569.36 1092.25 1426.45 1624702.09 583751.83 318037.93 37267646.91 72290.81 7395099.64 15654282.58 113379.28 5985.69 22220.7 151037296.46 30776841.73 5089.19 299.99 102604.74 86257.77 2354926.97 6918.49 4692452.8 36361.29 101321237450 34448701700 382793028680 25510000 24460000 16430000 1135.4365 343.7639 483.308 33.2422 198.3147 201.7554 33.1592 46.4998 477.6899 2684.8341 46.4194 433.8593 11438.863847 38252.62844 45041.154853 331579 541552 60 48 226859548 285766 285316 207891 420437471 13785530 2473336 264748 404291813 8636563 3537322 443804 6345.3 26.64 14.11 3.14 518085.7 55.7222 183.6437 130.6575 1850.2264 316.9118 311.5325 1843.2396 1336.3924 132.2627 23.4209 1327.9962 145.2837 80.243 310.137 264.744 407.19 OpenBenchmarking.org
PyTorch This is a benchmark of PyTorch making use of pytorch-benchmark [https://github.com/LukasHedegaard/pytorch-benchmark]. Currently this test profile is catered to CPU-based testing. Learn more via the OpenBenchmarking.org test page.
PyTorch 2.1 - Device: CPU - Batch Size: 1 - Model: ResNet-50 (batches/sec, More Is Better): G242-P36 1.91. SE +/- 0.00, N = 3. MIN: 1.8 / MAX: 2.09
PyTorch 2.1 - Device: CPU - Batch Size: 1 - Model: ResNet-152 (batches/sec, More Is Better): G242-P36 0.68. SE +/- 0.00, N = 3. MIN: 0.65 / MAX: 0.7
PyTorch 2.1 - Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l (batches/sec, More Is Better): G242-P36 0.30. SE +/- 0.00, N = 3. MIN: 0.27 / MAX: 0.4
PyTorch 2.1 - Device: CPU - Batch Size: 16 - Model: ResNet-50 (batches/sec, More Is Better): G242-P36 1.83. SE +/- 0.02, N = 5. MIN: 1.7 / MAX: 2.02
PyTorch 2.1 - Device: CPU - Batch Size: 16 - Model: ResNet-152 (batches/sec, More Is Better): G242-P36 0.67. SE +/- 0.00, N = 2. MIN: 0.65 / MAX: 0.7
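Each result in this report carries an "SE +/- x, N = y" annotation. Assuming SE denotes the standard error of the mean over N runs, it can be sketched as below. The per-run values are hypothetical, since the report lists only aggregates, and `standard_error` is an illustrative helper rather than Phoronix Test Suite code.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two runs to estimate a standard error")
    return statistics.stdev(samples) / math.sqrt(n)


# Hypothetical per-run results in batches/sec; the report lists only the
# aggregate figures (mean, SE, N, MIN, MAX), not the individual runs.
runs = [1.80, 1.84, 1.85]

mean = statistics.mean(runs)
se = standard_error(runs)
print(f"{mean:.2f} batches/sec, SE +/- {se:.2f}, N = {len(runs)}")
```

A small SE relative to the mean suggests the runs agreed closely; tests in this report with N = 15 and a comparatively large SE (Stockfish, for example) likely needed extra runs because of run-to-run variance.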
Stress-NG 0.16.04 - Test: Crypto (Bogo Ops/s, More Is Better): gig 251986.12, dd 251996.36, G242-P36 252315.26. SE +/- 928.63, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Memory Copying (Bogo Ops/s, More Is Better): gig 27162.14, dd 27159.07, G242-P36 27153.74. SE +/- 1.16, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Glibc Qsort Data Sorting (Bogo Ops/s, More Is Better): gig 2022.01, dd 2020.30, G242-P36 2020.18. SE +/- 0.78, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Glibc C String Functions (Bogo Ops/s, More Is Better): gig 62867317.16, dd 62845443.53, G242-P36 62783286.48. SE +/- 17918.08, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Vector Math (Bogo Ops/s, More Is Better): gig 398993.46, dd 399042.09, G242-P36 398869.87. SE +/- 4.53, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Matrix Math (Bogo Ops/s, More Is Better): gig 682490.75, dd 682554.33, G242-P36 681885.30. SE +/- 404.39, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Forking (Bogo Ops/s, More Is Better): gig 50130.97, dd 50686.58, G242-P36 52250.53. SE +/- 410.62, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: System V Message Passing (Bogo Ops/s, More Is Better): gig 21054213.79, dd 21119614.31, G242-P36 21143237.72. SE +/- 32907.24, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Semaphores (Bogo Ops/s, More Is Better): gig 167850957.68, dd 166379337.67, G242-P36 167637763.59. SE +/- 217685.76, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Socket Activity (Bogo Ops/s, More Is Better): gig 27959.85, dd 27536.79, G242-P36 28009.07. SE +/- 159.43, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Context Switching (Bogo Ops/s, More Is Better): gig 19654874.85, dd 20708288.98, G242-P36 20365273.28. SE +/- 174052.70, N = 15. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: CPU Cache (Bogo Ops/s, More Is Better): gig 882510.28, dd 882225.34, G242-P36 879814.35. SE +/- 1033.74, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Malloc (Bogo Ops/s, More Is Better): gig 164067515.18, dd 164592319.96, G242-P36 164364343.39. SE +/- 296218.44, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: SENDFILE (Bogo Ops/s, More Is Better): gig 1624969.46, dd 1624702.09, G242-P36 1624492.92. SE +/- 18.53, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: IO_uring (Bogo Ops/s, More Is Better): gig 612149.93, dd 583751.83, G242-P36 604943.76. SE +/- 5192.48, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Futex (Bogo Ops/s, More Is Better): gig 323012.96, dd 318037.93, G242-P36 343012.75. SE +/- 7072.24, N = 15. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Mutex (Bogo Ops/s, More Is Better): gig 37215286.04, dd 37267646.91, G242-P36 37172432.66. SE +/- 9463.26, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Function Call (Bogo Ops/s, More Is Better): gig 72298.23, dd 72290.81, G242-P36 72283.18. SE +/- 1.53, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Poll (Bogo Ops/s, More Is Better): gig 7392099.82, dd 7395099.64, G242-P36 7330369.96. SE +/- 12697.25, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Hash (Bogo Ops/s, More Is Better): gig 15654462.92, dd 15654282.58, G242-P36 15671801.48. SE +/- 9429.94, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Pthread (Bogo Ops/s, More Is Better): gig 112993.15, dd 113379.28, G242-P36 113551.87. SE +/- 65.20, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Zlib (Bogo Ops/s, More Is Better): gig 5993.74, dd 5985.69, G242-P36 5987.88. SE +/- 0.87, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Floating Point (Bogo Ops/s, More Is Better): gig 22219.80, dd 22220.70, G242-P36 22213.54. SE +/- 0.42, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Fused Multiply-Add (Bogo Ops/s, More Is Better): gig 151387869.76, dd 151037296.46, G242-P36 151220570.51. SE +/- 110268.18, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Pipe (Bogo Ops/s, More Is Better): gig 29805509.12, dd 30776841.73, G242-P36 30330081.18. SE +/- 95784.06, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Matrix 3D Math (Bogo Ops/s, More Is Better): gig 5082.65, dd 5089.19, G242-P36 5099.81. SE +/- 3.74, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Vector Floating Point (Bogo Ops/s, More Is Better): gig 102553.11, dd 102604.74, G242-P36 102535.35. SE +/- 25.89, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Vector Shuffle (Bogo Ops/s, More Is Better): gig 86375.79, dd 86257.77, G242-P36 86218.95. SE +/- 3.20, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Wide Vector Math (Bogo Ops/s, More Is Better): gig 2355564.94, dd 2354926.97, G242-P36 2346519.63. SE +/- 6960.54, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: AVX-512 VNNI (Bogo Ops/s, More Is Better): gig 4691697.85, dd 4692452.80, G242-P36 4690386.64. SE +/- 401.84, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.16.04 - Test: Mixed Scheduler (Bogo Ops/s, More Is Better): gig 36309.29, dd 36361.29, G242-P36 36794.33. SE +/- 141.59, N = 3. (CXX) g++ options: -O2 -std=gnu99 -lc
OpenSSL OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.
OpenSSL 3.1 - Algorithm: SHA256 (byte/s, More Is Better): gig 100039593750, dd 101321237450, G242-P36 101322961753. SE +/- 64411674.99, N = 3. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl
OpenSSL 3.1 - Algorithm: SHA512 (byte/s, More Is Better): gig 34453399030, dd 34448701700, G242-P36 34478769590. SE +/- 8688088.34, N = 3. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl
OpenSSL 3.1 - Algorithm: AES-128-GCM (byte/s, More Is Better): gig 382856328260, dd 382793028680, G242-P36 382688207300. SE +/- 3586455.40, N = 3. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl
OpenSSL 3.1 - Algorithm: AES-256-GCM (byte/s, More Is Better): gig 306544534870, G242-P36 306487842680. SE +/- 40660594.45, N = 3. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl
OpenSSL 3.1 - Algorithm: ChaCha20 (byte/s, More Is Better): gig 161791663040, G242-P36 161732226070. SE +/- 10001054.79, N = 3. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl
OpenSSL 3.1 - Algorithm: ChaCha20-Poly1305 (byte/s, More Is Better): gig 112250396400, G242-P36 112213448840. SE +/- 361309.16, N = 3. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl
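The OpenSSL throughput figures are reported in raw byte/s, which is hard to read at this magnitude. The sketch below converts the G242-P36 results to decimal GB/s and derives a rough per-core figure. It assumes the reported rates are aggregates across all 128 cores, which the report does not state explicitly; `gb_per_s` is an illustrative helper.

```python
CORES = 128  # core count from the system details in this report


def gb_per_s(bytes_per_s):
    """Convert byte/s to decimal gigabytes per second (1 GB = 10**9 bytes)."""
    return bytes_per_s / 1e9


# G242-P36 results as reported, in byte/s.
results = {
    "SHA256": 101322961753,
    "SHA512": 34478769590,
    "AES-128-GCM": 382688207300,
    "AES-256-GCM": 306487842680,
    "ChaCha20": 161732226070,
    "ChaCha20-Poly1305": 112213448840,
}

for algorithm, rate in results.items():
    total = gb_per_s(rate)
    # Per-core figure assumes the reported rate is summed over all cores.
    print(f"{algorithm}: {total:.1f} GB/s total, {total / CORES:.2f} GB/s per core")
```

If the rates were instead measured on a single thread, the per-core column would not apply and only the total column should be read.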
Quicksilver Quicksilver is a proxy application that represents some elements of the Mercury workload by solving a simplified dynamic Monte Carlo particle transport problem. Quicksilver is developed by Lawrence Livermore National Laboratory (LLNL) and this test profile currently makes use of the OpenMP CPU threaded code path. Learn more via the OpenBenchmarking.org test page.
Quicksilver 20230818 - Input: CORAL2 P1 (Figure Of Merit, More Is Better): gig 25810000, dd 25510000, G242-P36 25273333. SE +/- 81103.50, N = 3. (CXX) g++ options: -fopenmp -O3 -march=native
Quicksilver 20230818 - Input: CORAL2 P2 (Figure Of Merit, More Is Better): gig 25520000, dd 24460000, G242-P36 25543333. SE +/- 84129.53, N = 3. (CXX) g++ options: -fopenmp -O3 -march=native
Quicksilver 20230818 - Input: CTS2 (Figure Of Merit, More Is Better): gig 16460000, dd 16430000, G242-P36 16203333. SE +/- 42557.15, N = 3. (CXX) g++ options: -fopenmp -O3 -march=native
Algebraic Multi-Grid Benchmark AMG is a parallel algebraic multigrid solver for linear systems arising from problems on unstructured grids. The driver provided with AMG builds linear systems for various 3-dimensional problems. Learn more via the OpenBenchmarking.org test page.
Algebraic Multi-Grid Benchmark 1.2 (Figure Of Merit, More Is Better): gig 1060136000, G242-P36 1057064333. SE +/- 47484.50, N = 3. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -lmpi
Xmrig Xmrig is an open-source cross-platform CPU/GPU miner for RandomX, KawPow, CryptoNight and AstroBWT. This test profile is set up to measure the Xmrig CPU mining performance. Learn more via the OpenBenchmarking.org test page.
Xmrig 6.21 - Variant: Monero - Hash Count: 1M (H/s, More Is Better): G242-P36 4201.7. SE +/- 17.55, N = 3. (CXX) g++ options: -fexceptions -fno-rtti -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
Xmrig 6.21 - Variant: Wownero - Hash Count: 1M (H/s, More Is Better): G242-P36 1935.2. SE +/- 2.92, N = 3. (CXX) g++ options: -fexceptions -fno-rtti -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
Neural Magic DeepSparse This is a benchmark of Neural Magic's DeepSparse using its built-in deepsparse.benchmark utility and various models from their SparseZoo (https://sparsezoo.neuralmagic.com/). Learn more via the OpenBenchmarking.org test page.
Neural Magic DeepSparse 1.6 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): gig 1141.45, dd 1135.44, G242-P36 1137.78. SE +/- 1.48, N = 3.
CacheBench This is a performance test of CacheBench, which is part of LLCbench. CacheBench is designed to test memory and cache bandwidth performance. Learn more via the OpenBenchmarking.org test page.
CacheBench - Test: Read (MB/s, More Is Better): gig 11438.67 (MIN: 11438.33 / MAX: 11438.85), dd 11438.86 (MIN: 11438.05 / MAX: 11439.05), G242-P36 11438.28 (MIN: 11437.32 / MAX: 11438.59). SE +/- 0.01, N = 3. (CC) gcc options: -O3 -lrt
CacheBench - Test: Write (MB/s, More Is Better): gig 38251.59 (MIN: 35289.91 / MAX: 41383.99), dd 38252.63 (MIN: 35291.37 / MAX: 41384.3), G242-P36 38239.97 (MIN: 35288.52 / MAX: 41382). SE +/- 1.22, N = 3. (CC) gcc options: -O3 -lrt
CacheBench - Test: Read / Modify / Write (MB/s, More Is Better): gig 45027.47 (MIN: 43694.36 / MAX: 45640.07), dd 45041.15 (MIN: 43693.38 / MAX: 45647.65), G242-P36 45034.98 (MIN: 43692.22 / MAX: 45639.26). SE +/- 2.04, N = 3. (CC) gcc options: -O3 -lrt
Stockfish This is a test of Stockfish, an advanced open-source C++11 chess benchmark that can scale up to 512 CPU threads. Learn more via the OpenBenchmarking.org test page.
Stockfish 15 - Total Time (Nodes Per Second, More Is Better): gig 177653916, dd 226859548, G242-P36 188653177. SE +/- 6857171.33, N = 15. (CXX) g++ options: -lgcov -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -flto -flto=jobserver
GROMACS This is a test of the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package using the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.
GROMACS 2023 - Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better): gig 4.688, G242-P36 4.588. SE +/- 0.002, N = 3. (CXX) g++ options: -O3
Speedb Speedb is a next-generation key-value storage engine that is RocksDB-compatible and aims for stability, efficiency, and performance. Learn more via the OpenBenchmarking.org test page.
Speedb 2.7 - Test: Sequential Fill (Op/s, More Is Better): gig 290059, dd 285766, G242-P36 295079. SE +/- 3101.60, N = 5. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
Speedb 2.7 - Test: Random Fill (Op/s, More Is Better): gig 278264, dd 285316, G242-P36 284987. SE +/- 1985.22, N = 3. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
Speedb 2.7 - Test: Random Fill Sync (Op/s, More Is Better): gig 204410, dd 207891, G242-P36 207376. SE +/- 1986.97, N = 3. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
Speedb 2.7 - Test: Random Read (Op/s, More Is Better): gig 418448304, dd 420437471, G242-P36 409571625. SE +/- 2947408.87, N = 11. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
Speedb 2.7 - Test: Read While Writing (Op/s, More Is Better): gig 13255341, dd 13785530, G242-P36 12905035. SE +/- 201662.23, N = 15. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
Speedb 2.7 - Test: Read Random Write Random (Op/s, More Is Better): gig 2518519, dd 2473336, G242-P36 2419683. SE +/- 21596.32, N = 3. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
Speedb 2.7 - Test: Update Random (Op/s, More Is Better): gig 264998, dd 264748, G242-P36 272275. SE +/- 1573.56, N = 3. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB This is a benchmark of Meta/Facebook's RocksDB as an embeddable persistent key-value store for fast storage based on Google's LevelDB. Learn more via the OpenBenchmarking.org test page.
RocksDB 8.0 - Test: Random Read (Op/s, More Is Better): gig 450500912, dd 404291813, G242-P36 434052355. SE +/- 4162622.50, N = 15. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB 8.0 - Test: Read While Writing (Op/s, More Is Better): gig 8516060, dd 8636563, G242-P36 8558845. SE +/- 68677.29, N = 9. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB 8.0 - Test: Read Random Write Random (Op/s, More Is Better): gig 3449038, dd 3537322, G242-P36 3320337. SE +/- 30568.75, N = 7. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB 8.0 - Test: Update Random (Op/s, More Is Better): gig 427908, dd 443804, G242-P36 431406. SE +/- 4409.44, N = 3. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
OpenSSL 3.1 - Algorithm: RSA4096 (sign/s, More Is Better): gig 6345.6, dd 6345.3, G242-P36 6342.8. SE +/- 0.10, N = 3. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl
Llama.cpp Llama.cpp is a port of Facebook's LLaMA model in C/C++ developed by Georgi Gerganov. Llama.cpp allows the inference of LLaMA and other supported models in C/C++. For CPU inference Llama.cpp supports AVX2/AVX-512, ARM NEON, and other modern ISAs along with features like OpenBLAS usage. Learn more via the OpenBenchmarking.org test page.
Llama.cpp b1808 - Model: llama-2-7b.Q4_0.gguf (Tokens Per Second, More Is Better): gig 21.90, dd 26.64, G242-P36 21.58. SE +/- 0.21, N = 6. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -lopenblas
Llama.cpp b1808 - Model: llama-2-13b.Q4_0.gguf (Tokens Per Second, More Is Better): gig 14.02, dd 14.11, G242-P36 13.90. SE +/- 0.16, N = 15. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -lopenblas
Llama.cpp b1808 - Model: llama-2-70b-chat.Q5_0.gguf (Tokens Per Second, More Is Better): gig 3.13, dd 3.14, G242-P36 3.07. SE +/- 0.03, N = 8. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -lopenblas
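To compare the three runs across the Llama.cpp models, one option is to express each result as a percentage of a reference run and then average those percentages with a geometric mean. The sketch below does this with the tokens-per-second figures reported above. The choice of G242-P36 as the 100% reference is an assumption made for illustration, and this is not necessarily the normalisation used for the Result Overview chart.

```python
import math

# Tokens per second as reported for the three runs.
results = {
    "llama-2-7b.Q4_0.gguf": {"gig": 21.90, "dd": 26.64, "G242-P36": 21.58},
    "llama-2-13b.Q4_0.gguf": {"gig": 14.02, "dd": 14.11, "G242-P36": 13.90},
    "llama-2-70b-chat.Q5_0.gguf": {"gig": 3.13, "dd": 3.14, "G242-P36": 3.07},
}


def relative_to(baseline, runs):
    """Express each run as a percentage of the baseline run's result."""
    return {name: 100.0 * value / runs[baseline] for name, value in runs.items()}


def geometric_mean(values):
    """Geometric mean, a common way to average ratios across tests."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


BASELINE = "G242-P36"  # assumption: any run could serve as the 100% reference
per_model = {model: relative_to(BASELINE, runs) for model, runs in results.items()}

for run in ("gig", "dd", "G242-P36"):
    overall = geometric_mean(per_model[model][run] for model in results)
    print(f"{run}: {overall:.1f}% of {BASELINE} (geometric mean over 3 models)")
```

The geometric mean keeps one unusually large ratio (here the dd result on the 7B model) from dominating the summary as much as an arithmetic mean would.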
OpenSSL 3.1 - Algorithm: RSA4096 (verify/s, More Is Better): gig 518115.9, dd 518085.7, G242-P36 517886.0. SE +/- 27.21, N = 3. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl
Neural Magic DeepSparse 1.6 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): gig 1832.12, dd 1850.23, G242-P36 1834.58. SE +/- 1.29, N = 3.
Neural Magic DeepSparse 1.6 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): gig 1830.72, dd 1843.24, G242-P36 1830.58. SE +/- 0.45, N = 3.
G242-P36
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
Processor Notes: Scaling Governor: cppc_cpufreq performance (Boost: Disabled)
Python Notes: Python 3.11.6
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 16 January 2024 23:01 by user phoronix.
gig
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
Processor Notes: Scaling Governor: cppc_cpufreq performance (Boost: Disabled)
Python Notes: Python 3.11.6
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 17 January 2024 18:09 by user phoronix.
dd Processor: ARMv8 Neoverse-N1 @ 3.00GHz (128 Cores), Motherboard: GIGABYTE G242-P36-00 MP32-AR2-00 v01000100 (F31k SCP: 2.10.20220531 BIOS), Chipset: Ampere Computing LLC Altra PCI Root Complex A, Memory: 16 x 32 GB DDR4-3200MT/s Samsung M393A4K40DB3-CWE, Disk: 800GB Micron_7450_MTFDKBA800TFS, Graphics: ASPEED, Monitor: VGA HDMI, Network: 2 x Intel I350
OS: Ubuntu 23.10, Kernel: 6.5.0-13-generic (aarch64), Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
Processor Notes: Scaling Governor: cppc_cpufreq performance (Boost: Disabled)
Python Notes: Python 3.11.6
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 17 January 2024 20:45 by user phoronix.