NVIDIA GH200 Compilers

Clang and GCC benchmarks by Michael Larabel for a future article. ARMv8 Neoverse-V2 testing with a Quanta Cloud QuantaGrid S74G-2U 1S7GZ9Z0000 S7G MB (CG1) (3A06 BIOS) and ASPEED on Ubuntu 23.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2402098-NE-NVIDIAGH291&sor.

NVIDIA GH200 Compilers: System Details

Processor: ARMv8 Neoverse-V2 @ 3.39GHz (72 Cores)
Motherboard: Quanta Cloud QuantaGrid S74G-2U 1S7GZ9Z0000 S7G MB (CG1) (3A06 BIOS)
Memory: 1 x 480GB DRAM-6400MT/s
Disk: 960GB SAMSUNG MZ1L2960HCJR-00A07 + 1920GB SAMSUNG MZTL21T9
Graphics: ASPEED
Network: 2 x Mellanox MT2910 + 2 x QLogic FastLinQ QL41000 10/25/40/50GbE
OS: Ubuntu 23.10
Kernel: 6.8.0-060800rc3daily20240208-generic-64k (aarch64)
Compiler: GCC 13.2.0 (GCC 13 run); Clang 17.0.2 (Clang 17 run)
File-System: ext4
Screen Resolution: 1920x1200

Kernel Details
- Transparent Huge Pages: madvise

Environment Details
- CXXFLAGS="-O3 -mtune=neoverse-v2 -mcpu=neoverse-v2" CFLAGS="-O3 -mtune=neoverse-v2 -mcpu=neoverse-v2"

Compiler Details
- GCC 13: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v

Processor Details
- Scaling Governor: cppc_cpufreq performance (Boost: Disabled)

Python Details
- Python 3.11.6

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_rstack_overflow: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of __user pointer sanitization
- spectre_v2: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected

NVIDIA GH200 Compilers: Results Overview

Test | GCC 13 | Clang 17
quantlib: Multi-Threaded | 232068.2 | 249451.3
quantlib: Single-Threaded | 3456.0 | 3740.4
minibude: OpenMP - BM1 (GFInst/s) | 1193.878 | 1551.880
minibude: OpenMP - BM1 (Billion Interactions/s) | 47.755 | 62.075
minibude: OpenMP - BM2 (GFInst/s) | 1201.027 | 1524.466
minibude: OpenMP - BM2 (Billion Interactions/s) | 48.041 | 60.979
lammps: 20k Atoms | 48.233 | 49.521
lammps: Rhodopsin Protein | 55.360 | 56.335
lulesh | 48090.643 | 47590.091
compress-zstd: 19 - Compression Speed | 14.7 | 14.9
compress-zstd: 19 - Decompression Speed | 1237.4 | 1031.1
compress-zstd: 19, Long Mode - Compression Speed | 8.57 | 8.70
compress-zstd: 19, Long Mode - Decompression Speed | 1283.6 | 1094.4
webp: Default | 13.95 | 15.42
webp: Quality 100 | 9.44 | 10.19
webp: Quality 100, Lossless | 1.28 | 1.31
webp: Quality 100, Highest Compression | 3.86 | 4.66
webp: Quality 100, Lossless, Highest Compression | 0.52 | 0.56
tscp: AI Chess Performance | 2078407 | 2248073
graphics-magick: Swirl | 3672 | 3748
graphics-magick: Rotate | 1764 | 1820
graphics-magick: Sharpen | 882 | 1761
graphics-magick: Enhanced | 2170 | 1542
graphics-magick: Resizing | 8044 | 7919
graphics-magick: Noise-Gaussian | 1920 | 1440
graphics-magick: HWB Color Space | 4731 | 4360
avifenc: 0 | 109.674 | 122.871
avifenc: 2 | 67.073 | 79.371
avifenc: 6, Lossless | 3.754 | 3.717
avifenc: 10, Lossless | 2.851 | 2.871
c-ray: Total Time - 4K, 16 Rays Per Pixel | 6.006 | 6.749
primesieve: 1e12 | 2.891 | 2.929
primesieve: 1e13 | 35.191 | 35.597
encode-flac: WAV To FLAC | 16.872 | 16.078
encode-mp3: WAV To MP3 | 5.474 | 6.287
encode-opus: WAV To Opus Encode | 33.037 | 31.420
helsing: 14 digit | 68.102 | 84.332
securemark: SecureMark-TLS | 265718 | 267498
liquid-dsp: 1 - 256 - 32 | 45523000 | 68921000
liquid-dsp: 1 - 256 - 57 | 26470667 | 36488333
liquid-dsp: 1 - 256 - 512 | 3194567 | 3414867
liquid-dsp: 32 - 256 - 32 | 1362833333 | 2066233333
liquid-dsp: 32 - 256 - 57 | 795033333 | 1096366667
liquid-dsp: 64 - 256 - 32 | 2671533333 | 4032833333
liquid-dsp: 64 - 256 - 57 | 1587900000 | 2182700000
liquid-dsp: 72 - 256 - 32 | 2952333333 | 4386000000
liquid-dsp: 72 - 256 - 57 | 1767800000 | 2406400000
liquid-dsp: 32 - 256 - 512 | 95931000 | 102853333
liquid-dsp: 64 - 256 - 512 | 191886667 | 206270000
liquid-dsp: 72 - 256 - 512 | 215500000 | 231370000
stress-ng: CPU Cache | 949580.78 | 932492.75
stress-ng: Matrix Math | 515044.15 | 550915.68
stress-ng: Vector Math | 387369.41 | 450187.94
stress-ng: Floating Point | 19830.04 | 19566.28
stress-ng: Vector Shuffle | 71014.33 | n/a
stress-ng: Fused Multiply-Add | 161511818.72 | 157339813.59
stress-ng: Vector Floating Point | 83730.08 | 141522.24
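Because the tests mix "More Is Better" and "Fewer Is Better" units, raw numbers cannot be averaged directly. The following is a minimal sketch of one common way to compare the two compilers: convert each result to a Clang-relative-to-GCC ratio that accounts for the metric's direction, then take a geometric mean. The four tests are an arbitrary illustrative subset copied from this result file (not a representative summary), and the helper name clang_speedup is hypothetical.

```python
import math

# (test, GCC 13 result, Clang 17 result, higher_is_better)
# Values copied from this result file; the subset is arbitrary and only illustrative.
results = [
    ("QuantLib Multi-Threaded (MFLOPS)", 232068.2, 249451.3, True),
    ("Zstd 19 Decompression Speed (MB/s)", 1237.4, 1031.1, True),
    ("C-Ray 4K, 16 Rays Per Pixel (seconds)", 6.006, 6.749, False),
    ("LAME WAV To MP3 (seconds)", 5.474, 6.287, False),
]


def clang_speedup(gcc, clang, higher_is_better):
    """Clang performance relative to GCC; a value above 1 means Clang is faster."""
    return clang / gcc if higher_is_better else gcc / clang


speedups = [clang_speedup(g, c, hib) for _, g, c, hib in results]

# Geometric mean is the usual choice for averaging ratios.
geomean = math.exp(sum(math.log(s) for s in speedups) / len(speedups))

for row, s in zip(results, speedups):
    print(f"{row[0]}: Clang/GCC = {s:.3f}")
print(f"Geometric mean over {len(results)} tests: {geomean:.3f}")
```

Extending the list to all tests in the file would give a fuller picture; the direction flag has to be set per test from the "More Is Better" or "Fewer Is Better" label.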

QuantLib

Configuration: Multi-Threaded

QuantLib 1.32 (MFLOPS, More Is Better)
Clang 17: 249451.3 (SE +/- 2974.17, N = 3)
GCC 13: 232068.2 (SE +/- 3083.44, N = 3)
1. (CXX) g++ options: -O3 -march=native -fPIE -pie

QuantLib

Configuration: Single-Threaded

QuantLib 1.32 (MFLOPS, More Is Better)
Clang 17: 3740.4 (SE +/- 37.89, N = 6)
GCC 13: 3456.0 (SE +/- 35.85, N = 3)
1. (CXX) g++ options: -O3 -march=native -fPIE -pie

miniBUDE

Implementation: OpenMP - Input Deck: BM1

miniBUDE 20210901 (GFInst/s, More Is Better)
Clang 17: 1551.88 (SE +/- 1.53, N = 3)
GCC 13: 1193.88 (SE +/- 2.51, N = 3)
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -mcpu=native -lm

miniBUDE

Implementation: OpenMP - Input Deck: BM1

miniBUDE 20210901 (Billion Interactions/s, More Is Better)
Clang 17: 62.08 (SE +/- 0.06, N = 3)
GCC 13: 47.76 (SE +/- 0.10, N = 3)
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -mcpu=native -lm

miniBUDE

Implementation: OpenMP - Input Deck: BM2

miniBUDE 20210901 (GFInst/s, More Is Better)
Clang 17: 1524.47 (SE +/- 16.91, N = 3)
GCC 13: 1201.03 (SE +/- 2.12, N = 3)
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -mcpu=native -lm

miniBUDE

Implementation: OpenMP - Input Deck: BM2

miniBUDE 20210901 (Billion Interactions/s, More Is Better)
Clang 17: 60.98 (SE +/- 0.68, N = 3)
GCC 13: 48.04 (SE +/- 0.09, N = 3)
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -mcpu=native -lm

LAMMPS Molecular Dynamics Simulator

Model: 20k Atoms

LAMMPS Molecular Dynamics Simulator 23Jun2022 (ns/day, More Is Better)
Clang 17: 49.52 (SE +/- 0.56, N = 3)
GCC 13: 48.23 (SE +/- 0.15, N = 3)
1. (CXX) g++ options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm -ldl

LAMMPS Molecular Dynamics Simulator

Model: Rhodopsin Protein

LAMMPS Molecular Dynamics Simulator 23Jun2022 (ns/day, More Is Better)
Clang 17: 56.34 (SE +/- 0.11, N = 3)
GCC 13: 55.36 (SE +/- 0.12, N = 3)
1. (CXX) g++ options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm -ldl

LULESH

LULESH 2.0.3 (z/s, More Is Better)
GCC 13: 48090.64 (SE +/- 108.54, N = 3)
Clang 17: 47590.09 (SE +/- 108.91, N = 3)
1. (CXX) g++ options: -O3 -fopenmp -lm -lmpi_cxx -lmpi

Zstd Compression

Compression Level: 19 - Compression Speed

Zstd Compression 1.5.4 (MB/s, More Is Better)
Clang 17: 14.9 (SE +/- 0.17, N = 3)
GCC 13: 14.7 (SE +/- 0.12, N = 3)
Per-result option noted in chart: -Qunused-arguments
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lz -llzma

Zstd Compression

Compression Level: 19 - Decompression Speed

Zstd Compression 1.5.4 (MB/s, More Is Better)
GCC 13: 1237.4 (SE +/- 13.01, N = 3)
Clang 17: 1031.1 (SE +/- 4.91, N = 3)
Per-result option noted in chart: -Qunused-arguments
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lz -llzma

Zstd Compression

Compression Level: 19, Long Mode - Compression Speed

Zstd Compression 1.5.4 (MB/s, More Is Better)
Clang 17: 8.70 (SE +/- 0.00, N = 3)
GCC 13: 8.57 (SE +/- 0.00, N = 3)
Per-result option noted in chart: -Qunused-arguments
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lz -llzma

Zstd Compression

Compression Level: 19, Long Mode - Decompression Speed

Zstd Compression 1.5.4 (MB/s, More Is Better)
GCC 13: 1283.6 (SE +/- 3.15, N = 3)
Clang 17: 1094.4 (SE +/- 2.48, N = 3)
Per-result option noted in chart: -Qunused-arguments
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lz -llzma

WebP Image Encode

Encode Settings: Default

WebP Image Encode 1.2.4 (MP/s, More Is Better)
Clang 17: 15.42 (SE +/- 0.01, N = 3)
GCC 13: 13.95 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

WebP Image Encode

Encode Settings: Quality 100

WebP Image Encode 1.2.4 (MP/s, More Is Better)
Clang 17: 10.19 (SE +/- 0.00, N = 3)
GCC 13: 9.44 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

WebP Image Encode

Encode Settings: Quality 100, Lossless

WebP Image Encode 1.2.4 (MP/s, More Is Better)
Clang 17: 1.31 (SE +/- 0.00, N = 3)
GCC 13: 1.28 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

WebP Image Encode

Encode Settings: Quality 100, Highest Compression

WebP Image Encode 1.2.4 (MP/s, More Is Better)
Clang 17: 4.66 (SE +/- 0.00, N = 3)
GCC 13: 3.86 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

WebP Image Encode

Encode Settings: Quality 100, Lossless, Highest Compression

WebP Image Encode 1.2.4 (MP/s, More Is Better)
Clang 17: 0.56 (SE +/- 0.00, N = 3)
GCC 13: 0.52 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

TSCP

AI Chess Performance

TSCP 1.81 (Nodes Per Second, More Is Better)
Clang 17: 2248073 (SE +/- 0.00, N = 5)
GCC 13: 2078407 (SE +/- 0.00, N = 5)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -march=native

GraphicsMagick

Operation: Swirl

GraphicsMagick 1.3.38 (Iterations Per Minute, More Is Better)
Clang 17: 3748 (SE +/- 48.77, N = 3)
GCC 13: 3672 (SE +/- 5.24, N = 3)
1. (CC) gcc options: -fopenmp -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread

GraphicsMagick

Operation: Rotate

GraphicsMagick 1.3.38 (Iterations Per Minute, More Is Better)
Clang 17: 1820 (SE +/- 25.64, N = 3)
GCC 13: 1764 (SE +/- 9.35, N = 3)
1. (CC) gcc options: -fopenmp -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.38 (Iterations Per Minute, More Is Better)
Clang 17: 1761 (SE +/- 11.15, N = 15)
GCC 13: 882 (SE +/- 1.86, N = 3)
1. (CC) gcc options: -fopenmp -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread

GraphicsMagick

Operation: Enhanced

GraphicsMagick 1.3.38 (Iterations Per Minute, More Is Better)
GCC 13: 2170 (SE +/- 11.57, N = 3)
Clang 17: 1542 (SE +/- 14.05, N = 3)
1. (CC) gcc options: -fopenmp -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.38 (Iterations Per Minute, More Is Better)
GCC 13: 8044 (SE +/- 39.89, N = 3)
Clang 17: 7919 (SE +/- 70.92, N = 3)
1. (CC) gcc options: -fopenmp -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread

GraphicsMagick

Operation: Noise-Gaussian

GraphicsMagick 1.3.38 (Iterations Per Minute, More Is Better)
GCC 13: 1920 (SE +/- 0.67, N = 3)
Clang 17: 1440 (SE +/- 4.67, N = 3)
1. (CC) gcc options: -fopenmp -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread

GraphicsMagick

Operation: HWB Color Space

GraphicsMagick 1.3.38 (Iterations Per Minute, More Is Better)
GCC 13: 4731 (SE +/- 41.68, N = 3)
Clang 17: 4360 (SE +/- 16.18, N = 3)
1. (CC) gcc options: -fopenmp -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread

libavif avifenc

Encoder Speed: 0

libavif avifenc 1.0 (Seconds, Fewer Is Better)
GCC 13: 109.67 (SE +/- 0.43, N = 3)
Clang 17: 122.87 (SE +/- 0.34, N = 3)
1. (CXX) g++ options: -O3 -fPIC -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

libavif avifenc

Encoder Speed: 2

libavif avifenc 1.0 (Seconds, Fewer Is Better)
GCC 13: 67.07 (SE +/- 0.03, N = 3)
Clang 17: 79.37 (SE +/- 0.27, N = 3)
1. (CXX) g++ options: -O3 -fPIC -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

libavif avifenc

Encoder Speed: 6, Lossless

libavif avifenc 1.0 (Seconds, Fewer Is Better)
Clang 17: 3.717 (SE +/- 0.008, N = 3)
GCC 13: 3.754 (SE +/- 0.007, N = 3)
1. (CXX) g++ options: -O3 -fPIC -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

libavif avifenc

Encoder Speed: 10, Lossless

libavif avifenc 1.0 (Seconds, Fewer Is Better)
GCC 13: 2.851 (SE +/- 0.006, N = 3)
Clang 17: 2.871 (SE +/- 0.003, N = 3)
1. (CXX) g++ options: -O3 -fPIC -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

C-Ray

Total Time - 4K, 16 Rays Per Pixel

C-Ray 1.1 (Seconds, Fewer Is Better)
GCC 13: 6.006 (SE +/- 0.003, N = 3)
Clang 17: 6.749 (SE +/- 0.009, N = 3)
1. (CC) gcc options: -lm -lpthread -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2

Primesieve

Length: 1e12

Primesieve 8.0 (Seconds, Fewer Is Better)
GCC 13: 2.891 (SE +/- 0.002, N = 3)
Clang 17: 2.929 (SE +/- 0.001, N = 3)
1. (CXX) g++ options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2

Primesieve

Length: 1e13

Primesieve 8.0 (Seconds, Fewer Is Better)
GCC 13: 35.19 (SE +/- 0.42, N = 3)
Clang 17: 35.60 (SE +/- 0.34, N = 3)
1. (CXX) g++ options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2

FLAC Audio Encoding

WAV To FLAC

FLAC Audio Encoding 1.4 (Seconds, Fewer Is Better)
Clang 17: 16.08 (SE +/- 0.13, N = 9)
GCC 13: 16.87 (SE +/- 0.19, N = 5)
1. (CXX) g++ options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -fvisibility=hidden -logg -lm

LAME MP3 Encoding

WAV To MP3

LAME MP3 Encoding 3.100 (Seconds, Fewer Is Better)
GCC 13: 5.474 (SE +/- 0.003, N = 3)
Clang 17: 6.287 (SE +/- 0.001, N = 3)
Per-result options noted in chart: -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -lncurses
1. (CC) gcc options: -O3 -pipe -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm

Opus Codec Encoding

WAV To Opus Encode

Opus Codec Encoding 1.4 (Seconds, Fewer Is Better)
Clang 17: 31.42 (SE +/- 0.01, N = 5)
GCC 13: 33.04 (SE +/- 0.02, N = 5)
1. (CXX) g++ options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -fvisibility=hidden -logg -lm

Helsing

Digit Range: 14 digit

Helsing 1.0-beta (Seconds, Fewer Is Better)
GCC 13: 68.10 (SE +/- 0.40, N = 15)
Clang 17: 84.33 (SE +/- 0.72, N = 3)
1. (CC) gcc options: -O2 -pthread

SecureMark

Benchmark: SecureMark-TLS

SecureMark 1.0.4 (marks, More Is Better)
Clang 17: 267498 (SE +/- 54.60, N = 3)
GCC 13: 265718 (SE +/- 525.41, N = 3)
1. (CC) gcc options: -pedantic -O3

Liquid-DSP

Threads: 1 - Buffer Length: 256 - Filter Length: 32

Liquid-DSP 1.6 (samples/s, More Is Better)
Clang 17: 68921000 (SE +/- 6082.76, N = 3)
GCC 13: 45523000 (SE +/- 30138.57, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 1 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 1.6 (samples/s, More Is Better)
Clang 17: 36488333 (SE +/- 1666.67, N = 3)
GCC 13: 26470667 (SE +/- 14240.01, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 1 - Buffer Length: 256 - Filter Length: 512

Liquid-DSP 1.6 (samples/s, More Is Better)
Clang 17: 3414867 (SE +/- 2635.86, N = 3)
GCC 13: 3194567 (SE +/- 1117.04, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 32 - Buffer Length: 256 - Filter Length: 32

Liquid-DSP 1.6 (samples/s, More Is Better)
Clang 17: 2066233333 (SE +/- 993870.10, N = 3)
GCC 13: 1362833333 (SE +/- 1337078.07, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 32 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 1.6 (samples/s, More Is Better)
Clang 17: 1096366667 (SE +/- 120185.04, N = 3)
GCC 13: 795033333 (SE +/- 153441.99, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 64 - Buffer Length: 256 - Filter Length: 32

Liquid-DSP 1.6 (samples/s, More Is Better)
Clang 17: 4032833333 (SE +/- 37196788.99, N = 3)
GCC 13: 2671533333 (SE +/- 18636374.23, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 64 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 1.6 (samples/s, More Is Better)
Clang 17: 2182700000 (SE +/- 4628534.69, N = 3)
GCC 13: 1587900000 (SE +/- 57735.03, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 72 - Buffer Length: 256 - Filter Length: 32

Liquid-DSP 1.6 (samples/s, More Is Better)
Clang 17: 4386000000 (SE +/- 26602506.15, N = 3)
GCC 13: 2952333333 (SE +/- 13903516.74, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 72 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 1.6 (samples/s, More Is Better)
Clang 17: 2406400000 (SE +/- 8154140.05, N = 3)
GCC 13: 1767800000 (SE +/- 5356304.70, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 32 - Buffer Length: 256 - Filter Length: 512

Liquid-DSP 1.6 (samples/s, More Is Better)
Clang 17: 102853333 (SE +/- 196751.39, N = 3)
GCC 13: 95931000 (SE +/- 129816.54, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 64 - Buffer Length: 256 - Filter Length: 512

Liquid-DSP 1.6 (samples/s, More Is Better)
Clang 17: 206270000 (SE +/- 37859.39, N = 3)
GCC 13: 191886667 (SE +/- 67659.28, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 72 - Buffer Length: 256 - Filter Length: 512

Liquid-DSP 1.6 (samples/s, More Is Better)
Clang 17: 231370000 (SE +/- 91651.51, N = 3)
GCC 13: 215500000 (SE +/- 79372.54, N = 3)
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid

Stress-NG

Test: CPU Cache

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
GCC 13: 949580.78 (SE +/- 34515.94, N = 15)
Clang 17: 932492.75 (SE +/- 44765.07, N = 15)
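Each result is reported with a standard error (SE) and run count (N), which helps judge whether a gap between the compilers is larger than run-to-run noise. The following is a rough sketch, not a rigorous significance test: it expresses the difference between two means in units of their combined standard error, assuming the runs are independent. The function name difference_in_se is hypothetical, and the inputs are copied from this result file.

```python
import math


def difference_in_se(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two means, in units of their combined standard error."""
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


# Stress-NG CPU Cache (Bogo Ops/s): GCC 13 vs Clang 17
cache_gap = difference_in_se(949580.78, 34515.94, 932492.75, 44765.07)

# QuantLib Single-Threaded (MFLOPS): Clang 17 vs GCC 13
quantlib_gap = difference_in_se(3740.4, 37.89, 3456.0, 35.85)

print(f"Stress-NG CPU Cache gap: {cache_gap:.2f} combined SE")
print(f"QuantLib Single-Threaded gap: {quantlib_gap:.2f} combined SE")
```

A gap well under one combined SE, as with the CPU Cache test, is hard to distinguish from noise, while a gap of several combined SE is more likely to reflect a real difference. With N as small as 3, this should be read only as a heuristic.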

Stress-NG

Test: Matrix Math

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Clang 17: 550915.68 (SE +/- 356.30, N = 3)
GCC 13: 515044.15 (SE +/- 2880.52, N = 3)

Stress-NG

Test: Vector Math

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Clang 17: 450187.94 (SE +/- 100.06, N = 3)
GCC 13: 387369.41 (SE +/- 25.00, N = 3)

Stress-NG

Test: Floating Point

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
GCC 13: 19830.04 (SE +/- 0.79, N = 3)
Clang 17: 19566.28 (SE +/- 4.32, N = 3)

Stress-NG

Test: Vector Shuffle

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
GCC 13: 71014.33 (SE +/- 173.05, N = 3)
1. (CXX) g++ options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -std=gnu99 -U_FORTIFY_SOURCE -O2 -lc

Stress-NG

Test: Fused Multiply-Add

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
GCC 13: 161511818.72 (SE +/- 1745267.88, N = 3)
Clang 17: 157339813.59 (SE +/- 1087775.35, N = 15)

Stress-NG

Test: Vector Floating Point

Stress-NG 0.16.04 (Bogo Ops/s, More Is Better)
Clang 17: 141522.24 (SE +/- 38.13, N = 3)
GCC 13: 83730.08 (SE +/- 275.28, N = 3)


Phoronix Test Suite v10.8.4