new amp

ARMv8 Neoverse-N1 testing with a GIGABYTE G242-P36-00 MP32-AR2-00 v01000100 (F31k SCP: 2.10.20220531 BIOS) and ASPEED on Ubuntu 23.10 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2402068-NE-NEWAMP18865

Run Management

Result Identifier   Date Run      Test Duration
a                   February 06   52 Minutes
b                   February 06   53 Minutes
c                   February 06   51 Minutes
Average                           52 Minutes

New Amp Benchmarks

Processor: ARMv8 Neoverse-N1 @ 3.00GHz (128 Cores)
Motherboard: GIGABYTE G242-P36-00 MP32-AR2-00 v01000100 (F31k SCP: 2.10.20220531 BIOS)
Chipset: Ampere Computing LLC Altra PCI Root Complex A
Memory: 16 x 32GB DDR4-3200MT/s Samsung M393A4K40DB3-CWE
Disk: 800GB Micron_7450_MTFDKBA800TFS
Graphics: ASPEED
Monitor: VGA HDMI
Network: 2 x Intel I350
OS: Ubuntu 23.10
Kernel: 6.5.0-13-generic (aarch64)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 1920x1080

System Logs

- Transparent Huge Pages: madvise
- --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
- Scaling Governor: cppc_cpufreq performance (Boost: Disabled)
- Python 3.11.6
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected

[Figure: Result Overview (Phoronix Test Suite). Relative performance of runs a, b, c on a 100% to 107% axis, grouped by Llamafile, ONNX Runtime, LZ4 Compression.]

a         b         c         Test
519.83    520.41    521.15    compress-lz4: 1 - Compression Speed
2815.2    2827.7    2841.8    compress-lz4: 1 - Decompression Speed
80.97     80.95     80.99     compress-lz4: 3 - Compression Speed
2492.2    2493.1    2491.6    compress-lz4: 3 - Decompression Speed
27.59     27.68     27.64     compress-lz4: 9 - Compression Speed
2511.8    2511      2512      compress-lz4: 9 - Decompression Speed
154.293   154.899   154.703   onnx: GPT-2 - CPU - Parallel
6.47235   6.44697   6.45507   onnx: GPT-2 - CPU - Parallel
178.736   176.523   177.439   onnx: GPT-2 - CPU - Standard
5.58525   5.65511   5.62585   onnx: GPT-2 - CPU - Standard
6.09066   6.16283   6.20055   onnx: yolov4 - CPU - Parallel
164.181   162.26    161.272   onnx: yolov4 - CPU - Parallel
7.13777   7.11377   7.12556   onnx: yolov4 - CPU - Standard
140.095   140.568   140.335   onnx: yolov4 - CPU - Standard
250.556   251.252   251.457   onnx: T5 Encoder - CPU - Parallel
3.98962   3.97869   3.9752    onnx: T5 Encoder - CPU - Parallel
258.637   258.855   253.597   onnx: T5 Encoder - CPU - Standard
3.86227   3.8592    3.93918   onnx: T5 Encoder - CPU - Standard
10.9277   11.7545   11.1091   onnx: bertsquad-12 - CPU - Parallel
91.5067   85.0699   90.0129   onnx: bertsquad-12 - CPU - Parallel
22.1724   22.0769   21.9998   onnx: bertsquad-12 - CPU - Standard
45.0965   45.2911   45.4499   onnx: bertsquad-12 - CPU - Standard
576.593   566.725   576.227   onnx: CaffeNet 12-int8 - CPU - Parallel
1.73248   1.76282   1.73356   onnx: CaffeNet 12-int8 - CPU - Parallel
701.371   698.343   700.482   onnx: CaffeNet 12-int8 - CPU - Standard
1.42343   1.42955   1.42532   onnx: CaffeNet 12-int8 - CPU - Standard
1.12538   1.14758   1.13122   onnx: fcn-resnet101-11 - CPU - Parallel
888.584   871.395   884       onnx: fcn-resnet101-11 - CPU - Parallel
1.20414   1.24444   1.25872   onnx: fcn-resnet101-11 - CPU - Standard
830.466   803.571   794.456   onnx: fcn-resnet101-11 - CPU - Standard
9.81261   9.82943   9.80991   onnx: ArcFace ResNet-100 - CPU - Parallel
101.907   101.733   101.936   onnx: ArcFace ResNet-100 - CPU - Parallel
11.0025   10.7484   10.9854   onnx: ArcFace ResNet-100 - CPU - Standard
90.885    93.0332   91.0258   onnx: ArcFace ResNet-100 - CPU - Standard
131.488   132       130.705   onnx: ResNet50 v1-12-int8 - CPU - Parallel
7.60357   7.57392   7.64929   onnx: ResNet50 v1-12-int8 - CPU - Parallel
170.121   167.736   170.632   onnx: ResNet50 v1-12-int8 - CPU - Standard
5.87533   5.95823   5.85706   onnx: ResNet50 v1-12-int8 - CPU - Standard
75.7142   75.6672   75.6401   onnx: super-resolution-10 - CPU - Parallel
13.2062   13.2144   13.2191   onnx: super-resolution-10 - CPU - Parallel
79.4944   79.5166   79.4851   onnx: super-resolution-10 - CPU - Standard
12.576    12.5723   12.5774   onnx: super-resolution-10 - CPU - Standard
24.8599   24.8398   24.8612   onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel
40.2226   40.2552   40.2202   onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel
25.3641   25.0685   25.4554   onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard
39.4206   39.8855   39.2789   onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard
3.31      3.02      3.31      llamafile: llava-v1.5-7b-q4 - CPU
3.15      2.89      2.83      llamafile: mistral-7b-instruct-v0.2.Q8_0 - CPU
1.78      1.74      1.77      llamafile: wizardcoder-python-34b-v1.0.Q6_K - CPU

LZ4 Compression

LZ4 Compression 1.9.4 (MB/s, More Is Better)

a         b         c         Test
519.83    520.41    521.15    Compression Level: 1 - Compression Speed
2815.2    2827.7    2841.8    Compression Level: 1 - Decompression Speed
80.97     80.95     80.99     Compression Level: 3 - Compression Speed
2492.2    2493.1    2491.6    Compression Level: 3 - Decompression Speed
27.59     27.68     27.64     Compression Level: 9 - Compression Speed
2511.8    2511.0    2512.0    Compression Level: 9 - Decompression Speed

1. (CC) gcc options: -O3
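
The three LZ4 runs are very close to one another, and a quick way to quantify that is the spread between the slowest and fastest run for each test. The following is a minimal sketch, not part of the Phoronix Test Suite: the helper `spread_percent` and the dictionary `lz4_results` are hypothetical names, and the numbers are copied by hand from the LZ4 Compression 1.9.4 results above.

```python
# MB/s for runs a, b, c, copied from the LZ4 Compression 1.9.4 results above.
# Keys are "<compression level> - <metric>".
lz4_results = {
    "1 - Compression Speed": (519.83, 520.41, 521.15),
    "1 - Decompression Speed": (2815.2, 2827.7, 2841.8),
    "3 - Compression Speed": (80.97, 80.95, 80.99),
    "3 - Decompression Speed": (2492.2, 2493.1, 2491.6),
    "9 - Compression Speed": (27.59, 27.68, 27.64),
    "9 - Decompression Speed": (2511.8, 2511.0, 2512.0),
}


def spread_percent(values):
    """Return (max - min) / min as a percentage (hypothetical helper)."""
    lo, hi = min(values), max(values)
    return (hi - lo) / lo * 100.0


# Spread across the three runs for every LZ4 test.
lz4_spread = {name: spread_percent(v) for name, v in lz4_results.items()}

for name, pct in lz4_spread.items():
    print(f"Level {name}: {pct:.2f}% spread across runs a, b, c")
```

With these inputs every LZ4 test stays below a 1% spread, which suggests the differences between runs a, b, and c are within ordinary run-to-run noise for this benchmark.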

ONNX Runtime

ONNX Runtime 1.17 (Inferences Per Second, More Is Better)

a         b         c         Test
154.29    154.90    154.70    Model: GPT-2 - Device: CPU - Executor: Parallel
178.74    176.52    177.44    Model: GPT-2 - Device: CPU - Executor: Standard
6.09066   6.16283   6.20055   Model: yolov4 - Device: CPU - Executor: Parallel
7.13777   7.11377   7.12556   Model: yolov4 - Device: CPU - Executor: Standard
250.56    251.25    251.46    Model: T5 Encoder - Device: CPU - Executor: Parallel
258.64    258.86    253.60    Model: T5 Encoder - Device: CPU - Executor: Standard
10.93     11.75     11.11     Model: bertsquad-12 - Device: CPU - Executor: Parallel
22.17     22.08     22.00     Model: bertsquad-12 - Device: CPU - Executor: Standard
576.59    566.73    576.23    Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel
701.37    698.34    700.48    Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard
1.12538   1.14758   1.13122   Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel
1.20414   1.24444   1.25872   Model: fcn-resnet101-11 - Device: CPU - Executor: Standard
9.81261   9.82943   9.80991   Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel
11.00     10.75     10.99     Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard
131.49    132.00    130.71    Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel
170.12    167.74    170.63    Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard
75.71     75.67     75.64     Model: super-resolution-10 - Device: CPU - Executor: Parallel
79.49     79.52     79.49     Model: super-resolution-10 - Device: CPU - Executor: Standard
24.86     24.84     24.86     Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel
25.36     25.07     25.46     Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard

1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
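
Each ONNX Runtime model is measured with both the Parallel and the Standard executor, so the two can be compared directly. The sketch below does this for run a only, using the inferences-per-second values listed above. The dictionary `onnx_run_a` and the ratio name `standard_over_parallel` are hypothetical, and the comparison is an illustration rather than an analysis the Phoronix Test Suite performs itself.

```python
# Run "a" inferences per second from the ONNX Runtime 1.17 results above,
# stored as model: (Parallel executor, Standard executor).
onnx_run_a = {
    "GPT-2": (154.29, 178.74),
    "yolov4": (6.09066, 7.13777),
    "T5 Encoder": (250.56, 258.64),
    "bertsquad-12": (10.93, 22.17),
    "CaffeNet 12-int8": (576.59, 701.37),
    "fcn-resnet101-11": (1.12538, 1.20414),
    "ArcFace ResNet-100": (9.81261, 11.00),
    "ResNet50 v1-12-int8": (131.49, 170.12),
    "super-resolution-10": (75.71, 79.49),
    "Faster R-CNN R-50-FPN-int8": (24.86, 25.36),
}

# Ratio > 1 means the Standard executor delivered more inferences per second.
standard_over_parallel = {
    model: standard / parallel
    for model, (parallel, standard) in onnx_run_a.items()
}

# Print models from the largest to the smallest Standard/Parallel ratio.
for model, ratio in sorted(
    standard_over_parallel.items(), key=lambda kv: kv[1], reverse=True
):
    print(f"{model}: Standard is {ratio:.2f}x Parallel")
```

For run a, the ratio is above 1 for every model, with bertsquad-12 showing the largest gap between the two executors.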

Llamafile

Llamafile 0.6 (Tokens Per Second, More Is Better)

a         b         c         Test
3.31      3.02      3.31      Test: llava-v1.5-7b-q4 - Acceleration: CPU
3.15      2.89      2.83      Test: mistral-7b-instruct-v0.2.Q8_0 - Acceleration: CPU
1.78      1.74      1.77      Test: wizardcoder-python-34b-v1.0.Q6_K - Acceleration: CPU
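
The Llamafile results vary more between runs than the other tests. One way to summarize that is a geometric mean per run, expressed relative to the lowest run. The sketch below assumes that normalization for illustration only. It is not necessarily how the Result Overview chart is computed, and `llamafile_runs`, `geo_means`, and `relative` are hypothetical names.

```python
from statistics import geometric_mean

# Tokens per second copied from the Llamafile 0.6 results above, in test order:
# llava-v1.5-7b-q4, mistral-7b-instruct-v0.2.Q8_0, wizardcoder-python-34b-v1.0.Q6_K
llamafile_runs = {
    "a": (3.31, 3.15, 1.78),
    "b": (3.02, 2.89, 1.74),
    "c": (3.31, 2.83, 1.77),
}

# Geometric mean of the three tests for each run.
geo_means = {run: geometric_mean(values) for run, values in llamafile_runs.items()}

# Express each run relative to the lowest geometric mean (assumed baseline = 100%).
baseline = min(geo_means.values())
relative = {run: gm / baseline * 100.0 for run, gm in geo_means.items()}

for run in sorted(relative):
    print(f"run {run}: geometric mean {geo_means[run]:.3f} tokens/s, "
          f"{relative[run]:.1f}% of the lowest run")
```

Under this normalization run b is the baseline, and run a comes out several percent ahead of it, mostly because of the llava-v1.5-7b-q4 and mistral-7b-instruct-v0.2.Q8_0 results.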