NVIDIA GH200 Compilers

Clang and GCC benchmarks by Michael Larabel for a future article. ARMv8 Neoverse-V2 testing with a Quanta Cloud QuantaGrid S74G-2U 1S7GZ9Z0000 S7G MB (CG1) (3A06 BIOS) and ASPEED on Ubuntu 23.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2402098-NE-NVIDIAGH291.

System configuration (the GCC 13 and Clang 17 runs differ only in the compiler):

Processor: ARMv8 Neoverse-V2 @ 3.39GHz (72 Cores)
Motherboard: Quanta Cloud QuantaGrid S74G-2U 1S7GZ9Z0000 S7G MB (CG1) (3A06 BIOS)
Memory: 1 x 480GB DRAM-6400MT/s
Disk: 960GB SAMSUNG MZ1L2960HCJR-00A07 + 1920GB SAMSUNG MZTL21T9
Graphics: ASPEED
Network: 2 x Mellanox MT2910 + 2 x QLogic FastLinQ QL41000 10/25/40/50GbE
OS: Ubuntu 23.10
Kernel: 6.8.0-060800rc3daily20240208-generic-64k (aarch64)
Compiler: GCC 13.2.0 (GCC 13 run), Clang 17.0.2 (Clang 17 run)
File-System: ext4
Screen Resolution: 1920x1200

Kernel Details
- Transparent Huge Pages: madvise

Environment Details
- CXXFLAGS="-O3 -mtune=neoverse-v2 -mcpu=neoverse-v2" CFLAGS="-O3 -mtune=neoverse-v2 -mcpu=neoverse-v2"

Compiler Details
- GCC 13: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v

Processor Details
- Scaling Governor: cppc_cpufreq performance (Boost: Disabled)

Python Details
- Python 3.11.6

Security Details
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected

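Result summary (units and direction are given in the per-test sections below):

quantlib: Multi-Threaded: GCC 13 232068.2; Clang 17 249451.3
quantlib: Single-Threaded: GCC 13 3456.0; Clang 17 3740.4
minibude: OpenMP - BM1 (GFInst/s): GCC 13 1193.878; Clang 17 1551.880
minibude: OpenMP - BM1 (Billion Interactions/s): GCC 13 47.755; Clang 17 62.075
minibude: OpenMP - BM2 (GFInst/s): GCC 13 1201.027; Clang 17 1524.466
minibude: OpenMP - BM2 (Billion Interactions/s): GCC 13 48.041; Clang 17 60.979
lammps: 20k Atoms: GCC 13 48.233; Clang 17 49.521
lammps: Rhodopsin Protein: GCC 13 55.360; Clang 17 56.335
lulesh: GCC 13 48090.643; Clang 17 47590.091
compress-zstd: 19 - Compression Speed: GCC 13 14.7; Clang 17 14.9
compress-zstd: 19 - Decompression Speed: GCC 13 1237.4; Clang 17 1031.1
compress-zstd: 19, Long Mode - Compression Speed: GCC 13 8.57; Clang 17 8.70
compress-zstd: 19, Long Mode - Decompression Speed: GCC 13 1283.6; Clang 17 1094.4
webp: Default: GCC 13 13.95; Clang 17 15.42
webp: Quality 100: GCC 13 9.44; Clang 17 10.19
webp: Quality 100, Lossless: GCC 13 1.28; Clang 17 1.31
webp: Quality 100, Highest Compression: GCC 13 3.86; Clang 17 4.66
webp: Quality 100, Lossless, Highest Compression: GCC 13 0.52; Clang 17 0.56
tscp: AI Chess Performance: GCC 13 2078407; Clang 17 2248073
graphics-magick: Swirl: GCC 13 3672; Clang 17 3748
graphics-magick: Rotate: GCC 13 1764; Clang 17 1820
graphics-magick: Sharpen: GCC 13 882; Clang 17 1761
graphics-magick: Enhanced: GCC 13 2170; Clang 17 1542
graphics-magick: Resizing: GCC 13 8044; Clang 17 7919
graphics-magick: Noise-Gaussian: GCC 13 1920; Clang 17 1440
graphics-magick: HWB Color Space: GCC 13 4731; Clang 17 4360
avifenc: 0: GCC 13 109.674; Clang 17 122.871
avifenc: 2: GCC 13 67.073; Clang 17 79.371
avifenc: 6, Lossless: GCC 13 3.754; Clang 17 3.717
avifenc: 10, Lossless: GCC 13 2.851; Clang 17 2.871
c-ray: Total Time - 4K, 16 Rays Per Pixel: GCC 13 6.006; Clang 17 6.749
primesieve: 1e12: GCC 13 2.891; Clang 17 2.929
primesieve: 1e13: GCC 13 35.191; Clang 17 35.597
encode-flac: WAV To FLAC: GCC 13 16.872; Clang 17 16.078
encode-mp3: WAV To MP3: GCC 13 5.474; Clang 17 6.287
encode-opus: WAV To Opus Encode: GCC 13 33.037; Clang 17 31.420
helsing: 14 digit: GCC 13 68.102; Clang 17 84.332
securemark: SecureMark-TLS: GCC 13 265718; Clang 17 267498
liquid-dsp: 1 - 256 - 32: GCC 13 45523000; Clang 17 68921000
liquid-dsp: 1 - 256 - 57: GCC 13 26470667; Clang 17 36488333
liquid-dsp: 1 - 256 - 512: GCC 13 3194567; Clang 17 3414867
liquid-dsp: 32 - 256 - 32: GCC 13 1362833333; Clang 17 2066233333
liquid-dsp: 32 - 256 - 57: GCC 13 795033333; Clang 17 1096366667
liquid-dsp: 64 - 256 - 32: GCC 13 2671533333; Clang 17 4032833333
liquid-dsp: 64 - 256 - 57: GCC 13 1587900000; Clang 17 2182700000
liquid-dsp: 72 - 256 - 32: GCC 13 2952333333; Clang 17 4386000000
liquid-dsp: 72 - 256 - 57: GCC 13 1767800000; Clang 17 2406400000
liquid-dsp: 32 - 256 - 512: GCC 13 95931000; Clang 17 102853333
liquid-dsp: 64 - 256 - 512: GCC 13 191886667; Clang 17 206270000
liquid-dsp: 72 - 256 - 512: GCC 13 215500000; Clang 17 231370000
stress-ng: CPU Cache: GCC 13 949580.78; Clang 17 932492.75
stress-ng: Matrix Math: GCC 13 515044.15; Clang 17 550915.68
stress-ng: Vector Math: GCC 13 387369.41; Clang 17 450187.94
stress-ng: Floating Point: GCC 13 19830.04; Clang 17 19566.28
stress-ng: Vector Shuffle: GCC 13 71014.33; Clang 17 (no result)
stress-ng: Fused Multiply-Add: GCC 13 161511818.72; Clang 17 157339813.59
stress-ng: Vector Floating Point: GCC 13 83730.08; Clang 17 141522.24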
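Because some tests report throughput (more is better) and others report seconds (fewer is better), a side-by-side reading needs the direction taken into account. The following is a minimal Python sketch of one way to turn a pair of values from this result file into a signed percentage; the function name `clang_advantage_percent` and the choice of four sample rows are illustrative assumptions, not part of the Phoronix Test Suite.

```python
# Sample rows copied from this result file: (GCC 13, Clang 17, higher_is_better).
# The selection of rows is arbitrary and only for illustration.
RESULTS = {
    "quantlib: Multi-Threaded": (232068.2, 249451.3, True),
    "compress-zstd: 19 - Decompression Speed": (1237.4, 1031.1, True),
    "c-ray: Total Time - 4K, 16 Rays Per Pixel": (6.006, 6.749, False),
    "encode-opus: WAV To Opus Encode": (33.037, 31.420, False),
}


def clang_advantage_percent(gcc, clang, higher_is_better):
    """Percent by which Clang 17 is ahead of GCC 13 (negative: GCC 13 is ahead).

    For "fewer is better" results the ratio is inverted so that a positive
    number always means the Clang 17 build performed better.
    """
    speedup = clang / gcc if higher_is_better else gcc / clang
    return (speedup - 1.0) * 100.0


if __name__ == "__main__":
    for name, (gcc, clang, higher_is_better) in RESULTS.items():
        pct = clang_advantage_percent(gcc, clang, higher_is_better)
        print(f"{name}: {pct:+.1f}%")
```

Inverting the ratio for time-based tests keeps the sign convention uniform, which makes it easier to scan a long list of mixed-unit results.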

QuantLib

Configuration: Multi-Threaded

MFLOPS, More Is Better (QuantLib 1.32)
GCC 13: 232068.2 (SE +/- 3083.44, N = 3)
Clang 17: 249451.3 (SE +/- 2974.17, N = 3)
1. (CXX) g++ options: -O3 -march=native -fPIE -pie
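Each result is reported with a standard error (SE) over N runs. As a rough, non-authoritative way to judge whether a gap between the two compilers is larger than run-to-run noise, the difference can be expressed in units of the combined standard error. The sketch below assumes the two SE values are listed in the same order as the compiler labels (GCC 13 first, Clang 17 second) and that the runs are independent; the helper name is hypothetical.

```python
import math


def difference_in_standard_errors(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two means divided by their combined standard error.

    The combined SE is sqrt(se_a^2 + se_b^2), which assumes independent runs.
    Returns math.inf when both SEs are reported as zero but the means differ,
    and 0.0 when both SEs are zero and the means are equal.
    """
    combined = math.sqrt(se_a ** 2 + se_b ** 2)
    diff = abs(mean_a - mean_b)
    if combined == 0.0:
        return math.inf if diff > 0.0 else 0.0
    return diff / combined


if __name__ == "__main__":
    # QuantLib 1.32, Multi-Threaded, values taken from this result file.
    ratio = difference_in_standard_errors(232068.2, 3083.44, 249451.3, 2974.17)
    print(f"Gap is {ratio:.2f} combined standard errors")
```

With N as small as 3 this is only a coarse indicator, so a small ratio is better read as "inconclusive" than as "no difference".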

QuantLib

Configuration: Single-Threaded

MFLOPS, More Is Better (QuantLib 1.32)
GCC 13: 3456.0 (SE +/- 35.85, N = 3)
Clang 17: 3740.4 (SE +/- 37.89, N = 6)
1. (CXX) g++ options: -O3 -march=native -fPIE -pie

miniBUDE

Implementation: OpenMP - Input Deck: BM1

GFInst/s, More Is Better (miniBUDE 20210901)
GCC 13: 1193.88 (SE +/- 2.51, N = 3)
Clang 17: 1551.88 (SE +/- 1.53, N = 3)
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -mcpu=native -lm

miniBUDE

Implementation: OpenMP - Input Deck: BM1

Billion Interactions/s, More Is Better (miniBUDE 20210901)
GCC 13: 47.76 (SE +/- 0.10, N = 3)
Clang 17: 62.08 (SE +/- 0.06, N = 3)
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -mcpu=native -lm

miniBUDE

Implementation: OpenMP - Input Deck: BM2

GFInst/s, More Is Better (miniBUDE 20210901)
GCC 13: 1201.03 (SE +/- 2.12, N = 3)
Clang 17: 1524.47 (SE +/- 16.91, N = 3)
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -mcpu=native -lm

miniBUDE

Implementation: OpenMP - Input Deck: BM2

Billion Interactions/s, More Is Better (miniBUDE 20210901)
GCC 13: 48.04 (SE +/- 0.09, N = 3)
Clang 17: 60.98 (SE +/- 0.68, N = 3)
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -mcpu=native -lm

LAMMPS Molecular Dynamics Simulator

Model: 20k Atoms

ns/day, More Is Better (LAMMPS Molecular Dynamics Simulator 23Jun2022)
GCC 13: 48.23 (SE +/- 0.15, N = 3)
Clang 17: 49.52 (SE +/- 0.56, N = 3)
1. (CXX) g++ options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm -ldl

LAMMPS Molecular Dynamics Simulator

Model: Rhodopsin Protein

ns/day, More Is Better (LAMMPS Molecular Dynamics Simulator 23Jun2022)
GCC 13: 55.36 (SE +/- 0.12, N = 3)
Clang 17: 56.34 (SE +/- 0.11, N = 3)
1. (CXX) g++ options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm -ldl

LULESH

z/s, More Is Better (LULESH 2.0.3)
GCC 13: 48090.64 (SE +/- 108.54, N = 3)
Clang 17: 47590.09 (SE +/- 108.91, N = 3)
1. (CXX) g++ options: -O3 -fopenmp -lm -lmpi_cxx -lmpi

Zstd Compression

Compression Level: 19 - Compression Speed

MB/s, More Is Better (Zstd Compression 1.5.4)
GCC 13: 14.7 (SE +/- 0.12, N = 3)
Clang 17: 14.9 (SE +/- 0.17, N = 3)
Additional flags shown on the chart: -Qunused-arguments
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lz -llzma

Zstd Compression

Compression Level: 19 - Decompression Speed

MB/s, More Is Better (Zstd Compression 1.5.4)
GCC 13: 1237.4 (SE +/- 13.01, N = 3)
Clang 17: 1031.1 (SE +/- 4.91, N = 3)
Additional flags shown on the chart: -Qunused-arguments
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lz -llzma

Zstd Compression

Compression Level: 19, Long Mode - Compression Speed

MB/s, More Is Better (Zstd Compression 1.5.4)
GCC 13: 8.57 (SE +/- 0.00, N = 3)
Clang 17: 8.70 (SE +/- 0.00, N = 3)
Additional flags shown on the chart: -Qunused-arguments
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lz -llzma

Zstd Compression

Compression Level: 19, Long Mode - Decompression Speed

MB/s, More Is Better (Zstd Compression 1.5.4)
GCC 13: 1283.6 (SE +/- 3.15, N = 3)
Clang 17: 1094.4 (SE +/- 2.48, N = 3)
Additional flags shown on the chart: -Qunused-arguments
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lz -llzma

WebP Image Encode

Encode Settings: Default

MP/s, More Is Better (WebP Image Encode 1.2.4)
GCC 13: 13.95 (SE +/- 0.03, N = 3)
Clang 17: 15.42 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

WebP Image Encode

Encode Settings: Quality 100

MP/s, More Is Better (WebP Image Encode 1.2.4)
GCC 13: 9.44 (SE +/- 0.01, N = 3)
Clang 17: 10.19 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

WebP Image Encode

Encode Settings: Quality 100, Lossless

MP/s, More Is Better (WebP Image Encode 1.2.4)
GCC 13: 1.28 (SE +/- 0.00, N = 3)
Clang 17: 1.31 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

WebP Image Encode

Encode Settings: Quality 100, Highest Compression

MP/s, More Is Better (WebP Image Encode 1.2.4)
GCC 13: 3.86 (SE +/- 0.00, N = 3)
Clang 17: 4.66 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

WebP Image Encode

Encode Settings: Quality 100, Lossless, Highest Compression

MP/s, More Is Better (WebP Image Encode 1.2.4)
GCC 13: 0.52 (SE +/- 0.00, N = 3)
Clang 17: 0.56 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

TSCP

AI Chess Performance

Nodes Per Second, More Is Better (TSCP 1.81)
GCC 13: 2078407 (SE +/- 0.00, N = 5)
Clang 17: 2248073 (SE +/- 0.00, N = 5)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -march=native

GraphicsMagick

Operation: Swirl

Iterations Per Minute, More Is Better (GraphicsMagick 1.3.38)
GCC 13: 3672 (SE +/- 5.24, N = 3)
Clang 17: 3748 (SE +/- 48.77, N = 3)
1. (CC) gcc options: -fopenmp -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread

GraphicsMagick

Operation: Rotate

Iterations Per Minute, More Is Better (GraphicsMagick 1.3.38)
GCC 13: 1764 (SE +/- 9.35, N = 3)
Clang 17: 1820 (SE +/- 25.64, N = 3)
1. (CC) gcc options: -fopenmp -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread

GraphicsMagick

Operation: Sharpen

Iterations Per Minute, More Is Better (GraphicsMagick 1.3.38)
GCC 13: 882 (SE +/- 1.86, N = 3)
Clang 17: 1761 (SE +/- 11.15, N = 15)
1. (CC) gcc options: -fopenmp -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread

GraphicsMagick

Operation: Enhanced

Iterations Per Minute, More Is Better (GraphicsMagick 1.3.38)
GCC 13: 2170 (SE +/- 11.57, N = 3)
Clang 17: 1542 (SE +/- 14.05, N = 3)
1. (CC) gcc options: -fopenmp -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread

GraphicsMagick

Operation: Resizing

Iterations Per Minute, More Is Better (GraphicsMagick 1.3.38)
GCC 13: 8044 (SE +/- 39.89, N = 3)
Clang 17: 7919 (SE +/- 70.92, N = 3)
1. (CC) gcc options: -fopenmp -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread

GraphicsMagick

Operation: Noise-Gaussian

Iterations Per Minute, More Is Better (GraphicsMagick 1.3.38)
GCC 13: 1920 (SE +/- 0.67, N = 3)
Clang 17: 1440 (SE +/- 4.67, N = 3)
1. (CC) gcc options: -fopenmp -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread

GraphicsMagick

Operation: HWB Color Space

Iterations Per Minute, More Is Better (GraphicsMagick 1.3.38)
GCC 13: 4731 (SE +/- 41.68, N = 3)
Clang 17: 4360 (SE +/- 16.18, N = 3)
1. (CC) gcc options: -fopenmp -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread

libavif avifenc

Encoder Speed: 0

Seconds, Fewer Is Better (libavif avifenc 1.0)
GCC 13: 109.67 (SE +/- 0.43, N = 3)
Clang 17: 122.87 (SE +/- 0.34, N = 3)
1. (CXX) g++ options: -O3 -fPIC -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

libavif avifenc

Encoder Speed: 2

Seconds, Fewer Is Better (libavif avifenc 1.0)
GCC 13: 67.07 (SE +/- 0.03, N = 3)
Clang 17: 79.37 (SE +/- 0.27, N = 3)
1. (CXX) g++ options: -O3 -fPIC -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

libavif avifenc

Encoder Speed: 6, Lossless

Seconds, Fewer Is Better (libavif avifenc 1.0)
GCC 13: 3.754 (SE +/- 0.007, N = 3)
Clang 17: 3.717 (SE +/- 0.008, N = 3)
1. (CXX) g++ options: -O3 -fPIC -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

libavif avifenc

Encoder Speed: 10, Lossless

Seconds, Fewer Is Better (libavif avifenc 1.0)
GCC 13: 2.851 (SE +/- 0.006, N = 3)
Clang 17: 2.871 (SE +/- 0.003, N = 3)
1. (CXX) g++ options: -O3 -fPIC -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

C-Ray

Total Time - 4K, 16 Rays Per Pixel

Seconds, Fewer Is Better (C-Ray 1.1)
GCC 13: 6.006 (SE +/- 0.003, N = 3)
Clang 17: 6.749 (SE +/- 0.009, N = 3)
1. (CC) gcc options: -lm -lpthread -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2

Primesieve

Length: 1e12

Seconds, Fewer Is Better (Primesieve 8.0)
GCC 13: 2.891 (SE +/- 0.002, N = 3)
Clang 17: 2.929 (SE +/- 0.001, N = 3)
1. (CXX) g++ options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2

Primesieve

Length: 1e13

Seconds, Fewer Is Better (Primesieve 8.0)
GCC 13: 35.19 (SE +/- 0.42, N = 3)
Clang 17: 35.60 (SE +/- 0.34, N = 3)
1. (CXX) g++ options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2

FLAC Audio Encoding

WAV To FLAC

Seconds, Fewer Is Better (FLAC Audio Encoding 1.4)
GCC 13: 16.87 (SE +/- 0.19, N = 5)
Clang 17: 16.08 (SE +/- 0.13, N = 9)
1. (CXX) g++ options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -fvisibility=hidden -logg -lm

LAME MP3 Encoding

WAV To MP3

Seconds, Fewer Is Better (LAME MP3 Encoding 3.100)
GCC 13: 5.474 (SE +/- 0.003, N = 3)
Clang 17: 6.287 (SE +/- 0.001, N = 3)
Additional flags shown on the chart: -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -lncurses
1. (CC) gcc options: -O3 -pipe -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

Opus Codec Encoding

WAV To Opus Encode

Seconds, Fewer Is Better (Opus Codec Encoding 1.4)
GCC 13: 33.04 (SE +/- 0.02, N = 5)
Clang 17: 31.42 (SE +/- 0.01, N = 5)
1. (CXX) g++ options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -fvisibility=hidden -logg -lm

Helsing

Digit Range: 14 digit

Seconds, Fewer Is Better (Helsing 1.0-beta)
GCC 13: 68.10 (SE +/- 0.40, N = 15)
Clang 17: 84.33 (SE +/- 0.72, N = 3)
1. (CC) gcc options: -O2 -pthread

SecureMark

Benchmark: SecureMark-TLS

marks, More Is Better (SecureMark 1.0.4)
GCC 13: 265718 (SE +/- 525.41, N = 3)
Clang 17: 267498 (SE +/- 54.60, N = 3)
1. (CC) gcc options: -pedantic -O3

Liquid-DSP

Threads: 1 - Buffer Length: 256 - Filter Length: 32

samples/s, More Is Better (Liquid-DSP 1.6)
GCC 13: 45523000 (SE +/- 30138.57, N = 3)
Clang 17: 68921000 (SE +/- 6082.76, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 1 - Buffer Length: 256 - Filter Length: 57

samples/s, More Is Better (Liquid-DSP 1.6)
GCC 13: 26470667 (SE +/- 14240.01, N = 3)
Clang 17: 36488333 (SE +/- 1666.67, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 1 - Buffer Length: 256 - Filter Length: 512

samples/s, More Is Better (Liquid-DSP 1.6)
GCC 13: 3194567 (SE +/- 1117.04, N = 3)
Clang 17: 3414867 (SE +/- 2635.86, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 32 - Buffer Length: 256 - Filter Length: 32

samples/s, More Is Better (Liquid-DSP 1.6)
GCC 13: 1362833333 (SE +/- 1337078.07, N = 3)
Clang 17: 2066233333 (SE +/- 993870.10, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 32 - Buffer Length: 256 - Filter Length: 57

samples/s, More Is Better (Liquid-DSP 1.6)
GCC 13: 795033333 (SE +/- 153441.99, N = 3)
Clang 17: 1096366667 (SE +/- 120185.04, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 64 - Buffer Length: 256 - Filter Length: 32

samples/s, More Is Better (Liquid-DSP 1.6)
GCC 13: 2671533333 (SE +/- 18636374.23, N = 3)
Clang 17: 4032833333 (SE +/- 37196788.99, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 64 - Buffer Length: 256 - Filter Length: 57

samples/s, More Is Better (Liquid-DSP 1.6)
GCC 13: 1587900000 (SE +/- 57735.03, N = 3)
Clang 17: 2182700000 (SE +/- 4628534.69, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 72 - Buffer Length: 256 - Filter Length: 32

samples/s, More Is Better (Liquid-DSP 1.6)
GCC 13: 2952333333 (SE +/- 13903516.74, N = 3)
Clang 17: 4386000000 (SE +/- 26602506.15, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 72 - Buffer Length: 256 - Filter Length: 57

samples/s, More Is Better (Liquid-DSP 1.6)
GCC 13: 1767800000 (SE +/- 5356304.70, N = 3)
Clang 17: 2406400000 (SE +/- 8154140.05, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 32 - Buffer Length: 256 - Filter Length: 512

samples/s, More Is Better (Liquid-DSP 1.6)
GCC 13: 95931000 (SE +/- 129816.54, N = 3)
Clang 17: 102853333 (SE +/- 196751.39, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 64 - Buffer Length: 256 - Filter Length: 512

samples/s, More Is Better (Liquid-DSP 1.6)
GCC 13: 191886667 (SE +/- 67659.28, N = 3)
Clang 17: 206270000 (SE +/- 37859.39, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 72 - Buffer Length: 256 - Filter Length: 512

samples/s, More Is Better (Liquid-DSP 1.6)
GCC 13: 215500000 (SE +/- 79372.54, N = 3)
Clang 17: 231370000 (SE +/- 91651.51, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Stress-NG

Test: CPU Cache

Bogo Ops/s, More Is Better (Stress-NG 0.16.04)
GCC 13: 949580.78 (SE +/- 34515.94, N = 15)
Clang 17: 932492.75 (SE +/- 44765.07, N = 15)

Stress-NG

Test: Matrix Math

Bogo Ops/s, More Is Better (Stress-NG 0.16.04)
GCC 13: 515044.15 (SE +/- 2880.52, N = 3)
Clang 17: 550915.68 (SE +/- 356.30, N = 3)

Stress-NG

Test: Vector Math

Bogo Ops/s, More Is Better (Stress-NG 0.16.04)
GCC 13: 387369.41 (SE +/- 25.00, N = 3)
Clang 17: 450187.94 (SE +/- 100.06, N = 3)

Stress-NG

Test: Floating Point

Bogo Ops/s, More Is Better (Stress-NG 0.16.04)
GCC 13: 19830.04 (SE +/- 0.79, N = 3)
Clang 17: 19566.28 (SE +/- 4.32, N = 3)

Stress-NG

Test: Vector Shuffle

Bogo Ops/s, More Is Better (Stress-NG 0.16.04)
GCC 13: 71014.33 (SE +/- 173.05, N = 3)
Clang 17: no result reported
1. (CXX) g++ options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -std=gnu99 -U_FORTIFY_SOURCE -O2 -lc

Stress-NG

Test: Fused Multiply-Add

Bogo Ops/s, More Is Better (Stress-NG 0.16.04)
GCC 13: 161511818.72 (SE +/- 1745267.88, N = 3)
Clang 17: 157339813.59 (SE +/- 1087775.35, N = 15)

Stress-NG

Test: Vector Floating Point

Bogo Ops/s, More Is Better (Stress-NG 0.16.04)
GCC 13: 83730.08 (SE +/- 275.28, N = 3)
Clang 17: 141522.24 (SE +/- 38.13, N = 3)


Phoronix Test Suite v10.8.4