NVIDIA GH200 Compilers

Clang and GCC benchmarks by Michael Larabel for a future article. ARMv8 Neoverse-V2 testing with a Quanta Cloud QuantaGrid S74G-2U 1S7GZ9Z0000 S7G MB (CG1) (3A06 BIOS) and ASPEED on Ubuntu 23.10 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2402098-NE-NVIDIAGH291

Run: GCC 13 - Date: February 09 - Test Duration: 2 Hours, 27 Minutes
Run: Clang 17 - Date: February 09 - Test Duration: 2 Hours, 31 Minutes
Average Test Duration: 2 Hours, 29 Minutes

NVIDIA GH200 Compilers - System Details

Processor: ARMv8 Neoverse-V2 @ 3.39GHz (72 Cores)
Motherboard: Quanta Cloud QuantaGrid S74G-2U 1S7GZ9Z0000 S7G MB (CG1) (3A06 BIOS)
Memory: 1 x 480GB DRAM-6400MT/s
Disk: 960GB SAMSUNG MZ1L2960HCJR-00A07 + 1920GB SAMSUNG MZTL21T9
Graphics: ASPEED
Network: 2 x Mellanox MT2910 + 2 x QLogic FastLinQ QL41000 10/25/40/50GbE
OS: Ubuntu 23.10
Kernel: 6.8.0-060800rc3daily20240208-generic-64k (aarch64)
Compilers: GCC 13.2.0 / Clang 17.0.2
File-System: ext4
Screen Resolution: 1920x1200

System Logs

- Transparent Huge Pages: madvise
- CXXFLAGS="-O3 -mtune=neoverse-v2 -mcpu=neoverse-v2" CFLAGS="-O3 -mtune=neoverse-v2 -mcpu=neoverse-v2"
- GCC 13: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
- Scaling Governor: cppc_cpufreq performance (Boost: Disabled)
- Python 3.11.6
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected

[Chart: GCC 13 vs. Clang 17 Comparison (Phoronix Test Suite), showing the per-test percentage lead of the faster compiler over the baseline]
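
The percentages in the comparison chart appear to express how far the faster compiler leads the slower one in each test. The following is a minimal Python sketch under that assumption; the helper name `percent_lead` is hypothetical and not part of the Phoronix Test Suite. It uses three value pairs taken from the per-test results on this page: GraphicsMagick Sharpen (1761 vs. 882), Stress-NG Vector Floating Point (141522.24 vs. 83730.08), and Helsing (68.10 vs. 84.33 seconds).

```python
def percent_lead(a: float, b: float) -> float:
    """Percent by which the better of two results leads the worse one.

    Dividing the larger value by the smaller one covers both
    "More Is Better" scores and "Fewer Is Better" timings, because in
    either case the ratio describes the winner's advantage.
    """
    larger, smaller = max(a, b), min(a, b)
    return (larger / smaller - 1.0) * 100.0


# Value pairs copied from the per-test results on this page.
print(round(percent_lead(1761, 882), 1))            # GraphicsMagick Sharpen
print(round(percent_lead(141522.24, 83730.08), 1))  # Stress-NG Vector Floating Point
print(round(percent_lead(84.33, 68.10), 1))         # Helsing, 14 digit (seconds)
```

These three calls yield 99.7, 69.0 and 23.8, which match the chart labels for those tests.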

[Table: NVIDIA GH200 Compilers result overview for GCC 13 and Clang 17, covering miniBUDE, Stress-NG, GraphicsMagick, SecureMark, Zstd Compression, QuantLib, WebP Image Encode, TSCP, LAMMPS, Liquid-DSP, LULESH, libavif avifenc, C-Ray, Primesieve, FLAC, LAME MP3, Opus and Helsing; the individual values are listed per test below]

miniBUDE

MiniBUDE is a mini application for the core computation of the Bristol University Docking Engine (BUDE). This test profile currently makes use of the OpenMP implementation of miniBUDE for CPU benchmarking. Learn more via the OpenBenchmarking.org test page.

miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM1
Billion Interactions/s, More Is Better
  Clang 17: 62.08 (SE +/- 0.06, N = 3)
  GCC 13: 47.76 (SE +/- 0.10, N = 3)
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -mcpu=native -lm

miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM2
Billion Interactions/s, More Is Better
  Clang 17: 60.98 (SE +/- 0.68, N = 3)
  GCC 13: 48.04 (SE +/- 0.09, N = 3)
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -mcpu=native -lm
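
Each result on this page is reported with a standard error (SE) over N runs. As a rough illustration of how such a figure is conventionally derived, the sketch below computes the sample standard deviation divided by the square root of N. This is the textbook definition and only an assumption about how the Phoronix Test Suite calculates it; the run values are hypothetical, since the page does not list individual runs.

```python
import math
import statistics


def standard_error(samples: list[float]) -> float:
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("at least two runs are needed to estimate the spread")
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical per-run scores (not taken from the result file).
runs = [62.0, 62.1, 62.2]
print(f"mean = {statistics.mean(runs):.2f}, "
      f"SE +/- {standard_error(runs):.2f}, N = {len(runs)}")
```

A small SE relative to the mean, as in most results here, suggests the run-to-run noise is well below the gap between the two compilers.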

Stress-NG

Stress-NG is a Linux stress tool developed by Colin Ian King. Learn more via the OpenBenchmarking.org test page.

Stress-NG 0.16.04 - Test: CPU Cache
Bogo Ops/s, More Is Better
  GCC 13: 949580.78 (SE +/- 34515.94, N = 15)
  Clang 17: 932492.75 (SE +/- 44765.07, N = 15)

Stress-NG 0.16.04 - Test: Matrix Math
Bogo Ops/s, More Is Better
  Clang 17: 550915.68 (SE +/- 356.30, N = 3)
  GCC 13: 515044.15 (SE +/- 2880.52, N = 3)

Stress-NG 0.16.04 - Test: Vector Math
Bogo Ops/s, More Is Better
  Clang 17: 450187.94 (SE +/- 100.06, N = 3)
  GCC 13: 387369.41 (SE +/- 25.00, N = 3)

Stress-NG 0.16.04 - Test: Floating Point
Bogo Ops/s, More Is Better
  GCC 13: 19830.04 (SE +/- 0.79, N = 3)
  Clang 17: 19566.28 (SE +/- 4.32, N = 3)

Stress-NG 0.16.04 - Test: Vector Shuffle
Bogo Ops/s, More Is Better
  GCC 13: 71014.33 (SE +/- 173.05, N = 3)
  Clang 17: The test run did not produce a result.
1. (CXX) g++ options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -std=gnu99 -U_FORTIFY_SOURCE -O2 -lc

Stress-NG 0.16.04 - Test: Fused Multiply-Add
Bogo Ops/s, More Is Better
  GCC 13: 161511818.72 (SE +/- 1745267.88, N = 3)
  Clang 17: 157339813.59 (SE +/- 1087775.35, N = 15)

Stress-NG 0.16.04 - Test: Vector Floating Point
Bogo Ops/s, More Is Better
  Clang 17: 141522.24 (SE +/- 38.13, N = 3)
  GCC 13: 83730.08 (SE +/- 275.28, N = 3)

miniBUDE

miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM1
GFInst/s, More Is Better
  Clang 17: 1551.88 (SE +/- 1.53, N = 3)
  GCC 13: 1193.88 (SE +/- 2.51, N = 3)
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -mcpu=native -lm

miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM2
GFInst/s, More Is Better
  Clang 17: 1524.47 (SE +/- 16.91, N = 3)
  GCC 13: 1201.03 (SE +/- 2.12, N = 3)
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -mcpu=native -lm

GraphicsMagick

This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image. Learn more via the OpenBenchmarking.org test page.

GraphicsMagick 1.3.38 - Operation: Swirl
Iterations Per Minute, More Is Better
  Clang 17: 3748 (SE +/- 48.77, N = 3)
  GCC 13: 3672 (SE +/- 5.24, N = 3)
1. (CC) gcc options: -fopenmp -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread

GraphicsMagick 1.3.38 - Operation: Rotate
Iterations Per Minute, More Is Better
  Clang 17: 1820 (SE +/- 25.64, N = 3)
  GCC 13: 1764 (SE +/- 9.35, N = 3)
1. (CC) gcc options: -fopenmp -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread

GraphicsMagick 1.3.38 - Operation: Sharpen
Iterations Per Minute, More Is Better
  Clang 17: 1761 (SE +/- 11.15, N = 15)
  GCC 13: 882 (SE +/- 1.86, N = 3)
1. (CC) gcc options: -fopenmp -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread

GraphicsMagick 1.3.38 - Operation: Enhanced
Iterations Per Minute, More Is Better
  GCC 13: 2170 (SE +/- 11.57, N = 3)
  Clang 17: 1542 (SE +/- 14.05, N = 3)
1. (CC) gcc options: -fopenmp -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread

GraphicsMagick 1.3.38 - Operation: Resizing
Iterations Per Minute, More Is Better
  GCC 13: 8044 (SE +/- 39.89, N = 3)
  Clang 17: 7919 (SE +/- 70.92, N = 3)
1. (CC) gcc options: -fopenmp -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread

GraphicsMagick 1.3.38 - Operation: Noise-Gaussian
Iterations Per Minute, More Is Better
  GCC 13: 1920 (SE +/- 0.67, N = 3)
  Clang 17: 1440 (SE +/- 4.67, N = 3)
1. (CC) gcc options: -fopenmp -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread

GraphicsMagick 1.3.38 - Operation: HWB Color Space
Iterations Per Minute, More Is Better
  GCC 13: 4731 (SE +/- 41.68, N = 3)
  Clang 17: 4360 (SE +/- 16.18, N = 3)
1. (CC) gcc options: -fopenmp -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread

SecureMark

SecureMark is an objective, standardized benchmarking framework for measuring the efficiency of cryptographic processing solutions developed by EEMBC. SecureMark-TLS is benchmarking Transport Layer Security performance with a focus on IoT/edge computing. Learn more via the OpenBenchmarking.org test page.

SecureMark 1.0.4 - Benchmark: SecureMark-TLS
marks, More Is Better
  Clang 17: 267498 (SE +/- 54.60, N = 3)
  GCC 13: 265718 (SE +/- 525.41, N = 3)
1. (CC) gcc options: -pedantic -O3

Zstd Compression

This test measures the time needed to compress/decompress a sample file (silesia.tar) using Zstd (Zstandard) compression with options for different compression levels / settings. Learn more via the OpenBenchmarking.org test page.

Zstd Compression 1.5.4 - Compression Level: 19 - Compression Speed
MB/s, More Is Better
  Clang 17: 14.9 (SE +/- 0.17, N = 3)
  GCC 13: 14.7 (SE +/- 0.12, N = 3)
Additional per-run option: -Qunused-arguments
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lz -llzma

Zstd Compression 1.5.4 - Compression Level: 19 - Decompression Speed
MB/s, More Is Better
  GCC 13: 1237.4 (SE +/- 13.01, N = 3)
  Clang 17: 1031.1 (SE +/- 4.91, N = 3)
Additional per-run option: -Qunused-arguments
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lz -llzma

Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Compression Speed
MB/s, More Is Better
  Clang 17: 8.70 (SE +/- 0.00, N = 3)
  GCC 13: 8.57 (SE +/- 0.00, N = 3)
Additional per-run option: -Qunused-arguments
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lz -llzma

Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Decompression Speed
MB/s, More Is Better
  GCC 13: 1283.6 (SE +/- 3.15, N = 3)
  Clang 17: 1094.4 (SE +/- 2.48, N = 3)
Additional per-run option: -Qunused-arguments
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lz -llzma

QuantLib

QuantLib is an open-source library/framework around quantitative finance for modeling, trading and risk management scenarios. QuantLib is written in C++ with Boost, and its built-in benchmark reports the QuantLib Benchmark Index score. Learn more via the OpenBenchmarking.org test page.

QuantLib 1.32 - Configuration: Multi-Threaded
MFLOPS, More Is Better
  Clang 17: 249451.3 (SE +/- 2974.17, N = 3)
  GCC 13: 232068.2 (SE +/- 3083.44, N = 3)
1. (CXX) g++ options: -O3 -march=native -fPIE -pie

QuantLib 1.32 - Configuration: Single-Threaded
MFLOPS, More Is Better
  Clang 17: 3740.4 (SE +/- 37.89, N = 6)
  GCC 13: 3456.0 (SE +/- 35.85, N = 3)
1. (CXX) g++ options: -O3 -march=native -fPIE -pie

WebP Image Encode

This is a test of Google's libwebp with the cwebp image encode utility and using a sample 6000x4000 pixel JPEG image as the input. Learn more via the OpenBenchmarking.org test page.

WebP Image Encode 1.2.4 - Encode Settings: Default
MP/s, More Is Better
  Clang 17: 15.42 (SE +/- 0.01, N = 3)
  GCC 13: 13.95 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

WebP Image Encode 1.2.4 - Encode Settings: Quality 100
MP/s, More Is Better
  Clang 17: 10.19 (SE +/- 0.00, N = 3)
  GCC 13: 9.44 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

WebP Image Encode 1.2.4 - Encode Settings: Quality 100, Lossless
MP/s, More Is Better
  Clang 17: 1.31 (SE +/- 0.00, N = 3)
  GCC 13: 1.28 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

WebP Image Encode 1.2.4 - Encode Settings: Quality 100, Highest Compression
MP/s, More Is Better
  Clang 17: 4.66 (SE +/- 0.00, N = 3)
  GCC 13: 3.86 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

WebP Image Encode 1.2.4 - Encode Settings: Quality 100, Lossless, Highest Compression
MP/s, More Is Better
  Clang 17: 0.56 (SE +/- 0.00, N = 3)
  GCC 13: 0.52 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

TSCP

This is a performance test of TSCP, Tom Kerrigan's Simple Chess Program, which has a built-in performance benchmark. Learn more via the OpenBenchmarking.org test page.

TSCP 1.81 - AI Chess Performance
Nodes Per Second, More Is Better
  Clang 17: 2248073 (SE +/- 0.00, N = 5)
  GCC 13: 2078407 (SE +/- 0.00, N = 5)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -march=native

LAMMPS Molecular Dynamics Simulator

LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. Learn more via the OpenBenchmarking.org test page.

LAMMPS Molecular Dynamics Simulator 23Jun2022 - Model: 20k Atoms
ns/day, More Is Better
  Clang 17: 49.52 (SE +/- 0.56, N = 3)
  GCC 13: 48.23 (SE +/- 0.15, N = 3)
1. (CXX) g++ options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm -ldl

LAMMPS Molecular Dynamics Simulator 23Jun2022 - Model: Rhodopsin Protein
ns/day, More Is Better
  Clang 17: 56.34 (SE +/- 0.11, N = 3)
  GCC 13: 55.36 (SE +/- 0.12, N = 3)
1. (CXX) g++ options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm -ldl

Liquid-DSP

LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.

Liquid-DSP 1.6 - Threads: 1 - Buffer Length: 256 - Filter Length: 32
samples/s, More Is Better
  Clang 17: 68921000 (SE +/- 6082.76, N = 3)
  GCC 13: 45523000 (SE +/- 30138.57, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP 1.6 - Threads: 1 - Buffer Length: 256 - Filter Length: 57
samples/s, More Is Better
  Clang 17: 36488333 (SE +/- 1666.67, N = 3)
  GCC 13: 26470667 (SE +/- 14240.01, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP 1.6 - Threads: 1 - Buffer Length: 256 - Filter Length: 512
samples/s, More Is Better
  Clang 17: 3414867 (SE +/- 2635.86, N = 3)
  GCC 13: 3194567 (SE +/- 1117.04, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP 1.6 - Threads: 32 - Buffer Length: 256 - Filter Length: 32
samples/s, More Is Better
  Clang 17: 2066233333 (SE +/- 993870.10, N = 3)
  GCC 13: 1362833333 (SE +/- 1337078.07, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP 1.6 - Threads: 32 - Buffer Length: 256 - Filter Length: 57
samples/s, More Is Better
  Clang 17: 1096366667 (SE +/- 120185.04, N = 3)
  GCC 13: 795033333 (SE +/- 153441.99, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP 1.6 - Threads: 64 - Buffer Length: 256 - Filter Length: 32
samples/s, More Is Better
  Clang 17: 4032833333 (SE +/- 37196788.99, N = 3)
  GCC 13: 2671533333 (SE +/- 18636374.23, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP 1.6 - Threads: 64 - Buffer Length: 256 - Filter Length: 57
samples/s, More Is Better
  Clang 17: 2182700000 (SE +/- 4628534.69, N = 3)
  GCC 13: 1587900000 (SE +/- 57735.03, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP 1.6 - Threads: 72 - Buffer Length: 256 - Filter Length: 32
samples/s, More Is Better
  Clang 17: 4386000000 (SE +/- 26602506.15, N = 3)
  GCC 13: 2952333333 (SE +/- 13903516.74, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP 1.6 - Threads: 72 - Buffer Length: 256 - Filter Length: 57
samples/s, More Is Better
  Clang 17: 2406400000 (SE +/- 8154140.05, N = 3)
  GCC 13: 1767800000 (SE +/- 5356304.70, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP 1.6 - Threads: 32 - Buffer Length: 256 - Filter Length: 512
samples/s, More Is Better
  Clang 17: 102853333 (SE +/- 196751.39, N = 3)
  GCC 13: 95931000 (SE +/- 129816.54, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP 1.6 - Threads: 64 - Buffer Length: 256 - Filter Length: 512
samples/s, More Is Better
  Clang 17: 206270000 (SE +/- 37859.39, N = 3)
  GCC 13: 191886667 (SE +/- 67659.28, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP 1.6 - Threads: 72 - Buffer Length: 256 - Filter Length: 512
samples/s, More Is Better
  Clang 17: 231370000 (SE +/- 91651.51, N = 3)
  GCC 13: 215500000 (SE +/- 79372.54, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

LULESH

LULESH is the Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics. Learn more via the OpenBenchmarking.org test page.

LULESH 2.0.3
z/s, More Is Better
  GCC 13: 48090.64 (SE +/- 108.54, N = 3)
  Clang 17: 47590.09 (SE +/- 108.91, N = 3)
1. (CXX) g++ options: -O3 -fopenmp -lm -lmpi_cxx -lmpi

libavif avifenc

This is a test of the AOMedia libavif library testing the encoding of a JPEG image to AV1 Image Format (AVIF). Learn more via the OpenBenchmarking.org test page.

libavif avifenc 1.0 - Encoder Speed: 0
Seconds, Fewer Is Better
  GCC 13: 109.67 (SE +/- 0.43, N = 3)
  Clang 17: 122.87 (SE +/- 0.34, N = 3)
1. (CXX) g++ options: -O3 -fPIC -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

libavif avifenc 1.0 - Encoder Speed: 2
Seconds, Fewer Is Better
  GCC 13: 67.07 (SE +/- 0.03, N = 3)
  Clang 17: 79.37 (SE +/- 0.27, N = 3)
1. (CXX) g++ options: -O3 -fPIC -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

libavif avifenc 1.0 - Encoder Speed: 6, Lossless
Seconds, Fewer Is Better
  Clang 17: 3.717 (SE +/- 0.008, N = 3)
  GCC 13: 3.754 (SE +/- 0.007, N = 3)
1. (CXX) g++ options: -O3 -fPIC -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

libavif avifenc 1.0 - Encoder Speed: 10, Lossless
Seconds, Fewer Is Better
  GCC 13: 2.851 (SE +/- 0.006, N = 3)
  Clang 17: 2.871 (SE +/- 0.003, N = 3)
1. (CXX) g++ options: -O3 -fPIC -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

C-Ray

This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image. Learn more via the OpenBenchmarking.org test page.

C-Ray 1.1 - Total Time - 4K, 16 Rays Per Pixel
Seconds, Fewer Is Better
  GCC 13: 6.006 (SE +/- 0.003, N = 3)
  Clang 17: 6.749 (SE +/- 0.009, N = 3)
1. (CC) gcc options: -lm -lpthread -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2

Primesieve

Primesieve generates prime numbers using a highly optimized sieve of Eratosthenes implementation. Primesieve primarily benchmarks the CPU's L1/L2 cache performance. Learn more via the OpenBenchmarking.org test page.

Primesieve 8.0 - Length: 1e12
Seconds, Fewer Is Better
  GCC 13: 2.891 (SE +/- 0.002, N = 3)
  Clang 17: 2.929 (SE +/- 0.001, N = 3)
1. (CXX) g++ options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2

Primesieve 8.0 - Length: 1e13
Seconds, Fewer Is Better
  GCC 13: 35.19 (SE +/- 0.42, N = 3)
  Clang 17: 35.60 (SE +/- 0.34, N = 3)
1. (CXX) g++ options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2
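
For readers unfamiliar with the algorithm behind this test, here is a minimal Python sketch of a plain sieve of Eratosthenes. It only illustrates the basic idea: Primesieve itself is a heavily optimized implementation and does not work like this simple version, and the function name `sieve` is chosen purely for this example.

```python
def sieve(limit: int) -> list[int]:
    """Return all primes <= limit using a basic sieve of Eratosthenes."""
    if limit < 2:
        return []
    # One flag byte per integer: 1 means "still assumed prime".
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # Cross off multiples of p, starting at p*p; smaller
            # multiples were already removed by smaller primes.
            count = len(range(p * p, limit + 1, p))
            is_prime[p * p :: p] = bytes(count)
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]


print(sieve(30))
```

The flag array is why cache behaviour matters for this kind of workload: every crossing-off pass strides through memory, so a sieve that fits in L1/L2 runs much faster than one that does not.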

FLAC Audio Encoding

This test times how long it takes to encode a sample WAV file to FLAC audio format ten times using the --best preset settings. Learn more via the OpenBenchmarking.org test page.

FLAC Audio Encoding 1.4 - WAV To FLAC
Seconds, Fewer Is Better
  Clang 17: 16.08 (SE +/- 0.13, N = 9)
  GCC 13: 16.87 (SE +/- 0.19, N = 5)
1. (CXX) g++ options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -fvisibility=hidden -logg -lm

LAME MP3 Encoding

LAME is an MP3 encoder licensed under the LGPL. This test measures the time required to encode a WAV file to MP3 format. Learn more via the OpenBenchmarking.org test page.

LAME MP3 Encoding 3.100 - WAV To MP3
Seconds, Fewer Is Better
  GCC 13: 5.474 (SE +/- 0.003, N = 3)
  Clang 17: 6.287 (SE +/- 0.001, N = 3)
Additional per-run options: -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -lncurses
1. (CC) gcc options: -O3 -pipe -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

Opus Codec Encoding

Opus is an open audio codec. Opus is a lossy audio compression format designed primarily for interactive real-time applications over the Internet. This test uses Opus-Tools and measures the time required to encode a WAV file to Opus five times. Learn more via the OpenBenchmarking.org test page.

Opus Codec Encoding 1.4 - WAV To Opus Encode
Seconds, Fewer Is Better
  Clang 17: 31.42 (SE +/- 0.01, N = 5)
  GCC 13: 33.04 (SE +/- 0.02, N = 5)
1. (CXX) g++ options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -fvisibility=hidden -logg -lm

Helsing

Helsing is an open-source POSIX vampire number generator. This test profile measures the time it takes to generate vampire numbers between varying numbers of digits. Learn more via the OpenBenchmarking.org test page.

Helsing 1.0-beta - Digit Range: 14 digit
Seconds, Fewer Is Better
  GCC 13: 68.10 (SE +/- 0.40, N = 15)
  Clang 17: 84.33 (SE +/- 0.72, N = 3)
1. (CC) gcc options: -O2 -pthread
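
To make the workload concrete: a vampire number has an even number of digits and can be written as a product of two "fangs", each with half as many digits, whose combined digits are a rearrangement of the original number's digits (the two fangs may not both end in zero). The brute-force Python sketch below illustrates that definition only; it is not how Helsing is implemented, and the function name `fangs` is made up for this example.

```python
def fangs(n: int) -> list[tuple[int, int]]:
    """Return all fang pairs (a, b) with a <= b that make n a vampire number."""
    digits = str(n)
    if len(digits) % 2:
        return []  # vampire numbers need an even digit count
    half = len(digits) // 2
    low, high = 10 ** (half - 1), 10 ** half  # range of half-length numbers
    target = sorted(digits)
    pairs = []
    for a in range(low, high):
        if a * a > n:
            break  # beyond sqrt(n) every pair would repeat with a > b
        if n % a:
            continue
        b = n // a
        if not low <= b < high:
            continue  # the second fang must also have exactly half the digits
        if a % 10 == 0 and b % 10 == 0:
            continue  # fangs may not both end in zero
        if sorted(str(a) + str(b)) == target:
            pairs.append((a, b))
    return pairs


print(fangs(1260))
print([n for n in range(1000, 10000) if fangs(n)])
```

This prints `[(21, 60)]` followed by the seven four-digit vampire numbers `[1260, 1395, 1435, 1530, 1827, 2187, 6880]`. The 14-digit range used in the benchmark is far too large for this naive approach.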