Clang and GCC benchmarks by Michael Larabel for a future article. ARMv8 Neoverse-V2 testing with a Quanta Cloud QuantaGrid S74G-2U 1S7GZ9Z0000 S7G MB (CG1) (3A06 BIOS) and ASPEED on Ubuntu 23.10 via the Phoronix Test Suite.
GCC 13
Processor: ARMv8 Neoverse-V2 @ 3.39GHz (72 Cores), Motherboard: Quanta Cloud QuantaGrid S74G-2U 1S7GZ9Z0000 S7G MB (CG1) (3A06 BIOS), Memory: 1 x 480GB DRAM-6400MT/s, Disk: 960GB SAMSUNG MZ1L2960HCJR-00A07 + 1920GB SAMSUNG MZTL21T9, Graphics: ASPEED, Network: 2 x Mellanox MT2910 + 2 x QLogic FastLinQ QL41000 10/25/40/50GbE
OS: Ubuntu 23.10, Kernel: 6.8.0-060800rc3daily20240208-generic-64k (aarch64), Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1920x1200
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -mtune=neoverse-v2 -mcpu=neoverse-v2" CFLAGS="-O3 -mtune=neoverse-v2 -mcpu=neoverse-v2"
Compiler Notes: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
Processor Notes: Scaling Governor: cppc_cpufreq performance (Boost: Disabled)
Python Notes: Python 3.11.6
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Clang 17
OS: Ubuntu 23.10, Kernel: 6.8.0-060800rc3daily20240208-generic-64k (aarch64), Compiler: Clang 17.0.2, File-System: ext4, Screen Resolution: 1920x1200
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -mtune=neoverse-v2 -mcpu=neoverse-v2" CFLAGS="-O3 -mtune=neoverse-v2 -mcpu=neoverse-v2"
Processor Notes: Scaling Governor: cppc_cpufreq performance (Boost: Disabled)
Python Notes: Python 3.11.6
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected
GCC 13 vs. Clang 17 Comparison (Phoronix Test Suite): bar chart of the per-test relative difference between GCC 13 and Clang 17, on an axis running from Baseline to +99.6%. Labeled differences include GraphicsMagick Sharpen (99.7%), GraphicsMagick Enhanced (40.7%), GraphicsMagick Noise-Gaussian (33.3%), Helsing 14 digit (23.8%), LAME MP3 Encoding WAV To MP3 (14.9%) and GraphicsMagick HWB Color Space (8.5%).
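The percentages in the comparison chart appear consistent with dividing the better result by the worse one and subtracting one. The following is a minimal sketch of that arithmetic, not the Phoronix Test Suite's own code; the function name `compare` and the `higher_is_better` flag are illustrative assumptions, while the input values are taken from the results on this page.

```python
def compare(gcc: float, clang: float, higher_is_better: bool = True):
    """Return (winner, percent advantage) for one test result pair.

    Assumption: the advantage is (better / worse - 1) * 100, which matches
    the chart labels checked below but is not confirmed by the source.
    """
    if gcc == clang:
        return ("tie", 0.0)
    gcc_wins = (gcc > clang) == higher_is_better
    winner = "GCC 13" if gcc_wins else "Clang 17"
    percent = (max(gcc, clang) / min(gcc, clang) - 1.0) * 100.0
    return (winner, round(percent, 1))


# GraphicsMagick Sharpen, Iterations Per Minute (more is better)
print(compare(882, 1761))
# GraphicsMagick Enhanced, Iterations Per Minute (more is better)
print(compare(2170, 1542))
# Helsing 14 digit, seconds (fewer is better)
print(compare(68.102, 84.332, higher_is_better=False))
```

With the values above, the sketch reproduces the 99.7%, 40.7% and 23.8% labels from the chart.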
NVIDIA GH200 Compilers
Test | GCC 13 | Clang 17
minibude: OpenMP - BM1 (Billion Interactions/s) | 47.755 | 62.075
minibude: OpenMP - BM2 (Billion Interactions/s) | 48.041 | 60.979
stress-ng: CPU Cache | 949580.78 | 932492.75
stress-ng: Matrix Math | 515044.15 | 550915.68
stress-ng: Vector Math | 387369.41 | 450187.94
stress-ng: Floating Point | 19830.04 | 19566.28
stress-ng: Vector Shuffle | 71014.33 | no result
stress-ng: Fused Multiply-Add | 161511818.72 | 157339813.59
stress-ng: Vector Floating Point | 83730.08 | 141522.24
minibude: OpenMP - BM1 (GFInst/s) | 1193.878 | 1551.880
minibude: OpenMP - BM2 (GFInst/s) | 1201.027 | 1524.466
graphics-magick: Swirl | 3672 | 3748
graphics-magick: Rotate | 1764 | 1820
graphics-magick: Sharpen | 882 | 1761
graphics-magick: Enhanced | 2170 | 1542
graphics-magick: Resizing | 8044 | 7919
graphics-magick: Noise-Gaussian | 1920 | 1440
graphics-magick: HWB Color Space | 4731 | 4360
securemark: SecureMark-TLS | 265718 | 267498
compress-zstd: 19 - Compression Speed | 14.7 | 14.9
compress-zstd: 19 - Decompression Speed | 1237.4 | 1031.1
compress-zstd: 19, Long Mode - Compression Speed | 8.57 | 8.70
compress-zstd: 19, Long Mode - Decompression Speed | 1283.6 | 1094.4
quantlib: Multi-Threaded | 232068.2 | 249451.3
quantlib: Single-Threaded | 3456.0 | 3740.4
webp: Default | 13.95 | 15.42
webp: Quality 100 | 9.44 | 10.19
webp: Quality 100, Lossless | 1.28 | 1.31
webp: Quality 100, Highest Compression | 3.86 | 4.66
webp: Quality 100, Lossless, Highest Compression | 0.52 | 0.56
tscp: AI Chess Performance | 2078407 | 2248073
lammps: 20k Atoms | 48.233 | 49.521
lammps: Rhodopsin Protein | 55.360 | 56.335
liquid-dsp: 1 - 256 - 32 | 45523000 | 68921000
liquid-dsp: 1 - 256 - 57 | 26470667 | 36488333
liquid-dsp: 1 - 256 - 512 | 3194567 | 3414867
liquid-dsp: 32 - 256 - 32 | 1362833333 | 2066233333
liquid-dsp: 32 - 256 - 57 | 795033333 | 1096366667
liquid-dsp: 64 - 256 - 32 | 2671533333 | 4032833333
liquid-dsp: 64 - 256 - 57 | 1587900000 | 2182700000
liquid-dsp: 72 - 256 - 32 | 2952333333 | 4386000000
liquid-dsp: 72 - 256 - 57 | 1767800000 | 2406400000
liquid-dsp: 32 - 256 - 512 | 95931000 | 102853333
liquid-dsp: 64 - 256 - 512 | 191886667 | 206270000
liquid-dsp: 72 - 256 - 512 | 215500000 | 231370000
lulesh | 48090.643 | 47590.091
avifenc: 0 | 109.674 | 122.871
avifenc: 2 | 67.073 | 79.371
avifenc: 6, Lossless | 3.754 | 3.717
avifenc: 10, Lossless | 2.851 | 2.871
c-ray: Total Time - 4K, 16 Rays Per Pixel | 6.006 | 6.749
primesieve: 1e12 | 2.891 | 2.929
primesieve: 1e13 | 35.191 | 35.597
encode-flac: WAV To FLAC | 16.872 | 16.078
encode-mp3: WAV To MP3 | 5.474 | 6.287
encode-opus: WAV To Opus Encode | 33.037 | 31.420
helsing: 14 digit | 68.102 | 84.332
miniBUDE MiniBUDE is a mini application for the core computation of the Bristol University Docking Engine (BUDE). This test profile currently makes use of the OpenMP implementation of miniBUDE for CPU benchmarking. Learn more via the OpenBenchmarking.org test page.
miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM1 (Billion Interactions/s, More Is Better): Clang 17: 62.08 (SE +/- 0.06, N = 3); GCC 13: 47.76 (SE +/- 0.10, N = 3).
miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM2 (Billion Interactions/s, More Is Better): Clang 17: 60.98 (SE +/- 0.68, N = 3); GCC 13: 48.04 (SE +/- 0.09, N = 3).
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -mcpu=native -lm
Stress-NG 0.16.04, Test: Vector Shuffle (Bogo Ops/s, More Is Better): GCC 13: 71014.33 (SE +/- 173.05, N = 3); Clang 17: The test run did not produce a result. 1. (CXX) g++ options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -std=gnu99 -U_FORTIFY_SOURCE -O2 -lc
Stress-NG 0.16.04, Test: Fused Multiply-Add (Bogo Ops/s, More Is Better): Clang 17: 157339813.59 (SE +/- 1087775.35, N = 15); GCC 13: 161511818.72 (SE +/- 1745267.88, N = 3).
miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM1 (GFInst/s, More Is Better): Clang 17: 1551.88 (SE +/- 1.53, N = 3); GCC 13: 1193.88 (SE +/- 2.51, N = 3).
miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM2 (GFInst/s, More Is Better): Clang 17: 1524.47 (SE +/- 16.91, N = 3); GCC 13: 1201.03 (SE +/- 2.12, N = 3).
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -mcpu=native -lm
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image. Learn more via the OpenBenchmarking.org test page.
GraphicsMagick 1.3.38, Operation: Swirl (Iterations Per Minute, More Is Better): Clang 17: 3748 (SE +/- 48.77, N = 3); GCC 13: 3672 (SE +/- 5.24, N = 3).
GraphicsMagick 1.3.38, Operation: Rotate (Iterations Per Minute, More Is Better): Clang 17: 1820 (SE +/- 25.64, N = 3); GCC 13: 1764 (SE +/- 9.35, N = 3).
GraphicsMagick 1.3.38, Operation: Sharpen (Iterations Per Minute, More Is Better): Clang 17: 1761 (SE +/- 11.15, N = 15); GCC 13: 882 (SE +/- 1.86, N = 3).
GraphicsMagick 1.3.38, Operation: Enhanced (Iterations Per Minute, More Is Better): Clang 17: 1542 (SE +/- 14.05, N = 3); GCC 13: 2170 (SE +/- 11.57, N = 3).
GraphicsMagick 1.3.38, Operation: Resizing (Iterations Per Minute, More Is Better): Clang 17: 7919 (SE +/- 70.92, N = 3); GCC 13: 8044 (SE +/- 39.89, N = 3).
GraphicsMagick 1.3.38, Operation: Noise-Gaussian (Iterations Per Minute, More Is Better): Clang 17: 1440 (SE +/- 4.67, N = 3); GCC 13: 1920 (SE +/- 0.67, N = 3).
GraphicsMagick 1.3.38, Operation: HWB Color Space (Iterations Per Minute, More Is Better): Clang 17: 4360 (SE +/- 16.18, N = 3); GCC 13: 4731 (SE +/- 41.68, N = 3).
1. (CC) gcc options: -fopenmp -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lzstd -lm -lpthread
SecureMark SecureMark is an objective, standardized benchmarking framework for measuring the efficiency of cryptographic processing solutions developed by EEMBC. SecureMark-TLS is benchmarking Transport Layer Security performance with a focus on IoT/edge computing. Learn more via the OpenBenchmarking.org test page.
SecureMark 1.0.4, Benchmark: SecureMark-TLS (marks, More Is Better): Clang 17: 267498 (SE +/- 54.60, N = 3); GCC 13: 265718 (SE +/- 525.41, N = 3). 1. (CC) gcc options: -pedantic -O3
Zstd Compression This test measures the time needed to compress/decompress a sample file (silesia.tar) using Zstd (Zstandard) compression with options for different compression levels / settings. Learn more via the OpenBenchmarking.org test page.
Zstd Compression 1.5.4, Compression Level: 19 - Compression Speed (MB/s, More Is Better): Clang 17: 14.9 (SE +/- 0.17, N = 3); GCC 13: 14.7 (SE +/- 0.12, N = 3).
Zstd Compression 1.5.4, Compression Level: 19 - Decompression Speed (MB/s, More Is Better): Clang 17: 1031.1 (SE +/- 4.91, N = 3); GCC 13: 1237.4 (SE +/- 13.01, N = 3).
Zstd Compression 1.5.4, Compression Level: 19, Long Mode - Compression Speed (MB/s, More Is Better): Clang 17: 8.70 (SE +/- 0.00, N = 3); GCC 13: 8.57 (SE +/- 0.00, N = 3).
Zstd Compression 1.5.4, Compression Level: 19, Long Mode - Decompression Speed (MB/s, More Is Better): Clang 17: 1094.4 (SE +/- 2.48, N = 3); GCC 13: 1283.6 (SE +/- 3.15, N = 3).
-Qunused-arguments 1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lz -llzma
QuantLib QuantLib is an open-source library/framework around quantitative finance for modeling, trading and risk management scenarios. QuantLib is written in C++ with Boost, and its built-in benchmark reports the QuantLib Benchmark Index score. Learn more via the OpenBenchmarking.org test page.
QuantLib 1.32, Configuration: Multi-Threaded (MFLOPS, More Is Better): Clang 17: 249451.3 (SE +/- 2974.17, N = 3); GCC 13: 232068.2 (SE +/- 3083.44, N = 3).
QuantLib 1.32, Configuration: Single-Threaded (MFLOPS, More Is Better): Clang 17: 3740.4 (SE +/- 37.89, N = 6); GCC 13: 3456.0 (SE +/- 35.85, N = 3).
1. (CXX) g++ options: -O3 -march=native -fPIE -pie
WebP Image Encode 1.2.4, Encode Settings: Quality 100 (MP/s, More Is Better): Clang 17: 10.19 (SE +/- 0.00, N = 3); GCC 13: 9.44 (SE +/- 0.01, N = 3).
WebP Image Encode 1.2.4, Encode Settings: Quality 100, Lossless (MP/s, More Is Better): Clang 17: 1.31 (SE +/- 0.00, N = 3); GCC 13: 1.28 (SE +/- 0.00, N = 3).
WebP Image Encode 1.2.4, Encode Settings: Quality 100, Highest Compression (MP/s, More Is Better): Clang 17: 4.66 (SE +/- 0.00, N = 3); GCC 13: 3.86 (SE +/- 0.00, N = 3).
WebP Image Encode 1.2.4, Encode Settings: Quality 100, Lossless, Highest Compression (MP/s, More Is Better): Clang 17: 0.56 (SE +/- 0.00, N = 3); GCC 13: 0.52 (SE +/- 0.00, N = 3).
1. (CC) gcc options: -fvisibility=hidden -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm
TSCP This is a performance test of TSCP, Tom Kerrigan's Simple Chess Program, which has a built-in performance benchmark. Learn more via the OpenBenchmarking.org test page.
TSCP 1.81, AI Chess Performance (Nodes Per Second, More Is Better): Clang 17: 2248073 (SE +/- 0.00, N = 5); GCC 13: 2078407 (SE +/- 0.00, N = 5). 1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -march=native
Liquid-DSP LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.
Liquid-DSP 1.6, Threads: 1 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): Clang 17: 68921000 (SE +/- 6082.76, N = 3); GCC 13: 45523000 (SE +/- 30138.57, N = 3).
Liquid-DSP 1.6, Threads: 1 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): Clang 17: 36488333 (SE +/- 1666.67, N = 3); GCC 13: 26470667 (SE +/- 14240.01, N = 3).
Liquid-DSP 1.6, Threads: 1 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): Clang 17: 3414867 (SE +/- 2635.86, N = 3); GCC 13: 3194567 (SE +/- 1117.04, N = 3).
Liquid-DSP 1.6, Threads: 32 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): Clang 17: 2066233333 (SE +/- 993870.10, N = 3); GCC 13: 1362833333 (SE +/- 1337078.07, N = 3).
Liquid-DSP 1.6, Threads: 32 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): Clang 17: 1096366667 (SE +/- 120185.04, N = 3); GCC 13: 795033333 (SE +/- 153441.99, N = 3).
Liquid-DSP 1.6, Threads: 64 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): Clang 17: 4032833333 (SE +/- 37196788.99, N = 3); GCC 13: 2671533333 (SE +/- 18636374.23, N = 3).
Liquid-DSP 1.6, Threads: 64 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): Clang 17: 2182700000 (SE +/- 4628534.69, N = 3); GCC 13: 1587900000 (SE +/- 57735.03, N = 3).
Liquid-DSP 1.6, Threads: 72 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): Clang 17: 4386000000 (SE +/- 26602506.15, N = 3); GCC 13: 2952333333 (SE +/- 13903516.74, N = 3).
Liquid-DSP 1.6, Threads: 72 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): Clang 17: 2406400000 (SE +/- 8154140.05, N = 3); GCC 13: 1767800000 (SE +/- 5356304.70, N = 3).
Liquid-DSP 1.6, Threads: 32 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): Clang 17: 102853333 (SE +/- 196751.39, N = 3); GCC 13: 95931000 (SE +/- 129816.54, N = 3).
Liquid-DSP 1.6, Threads: 64 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): Clang 17: 206270000 (SE +/- 37859.39, N = 3); GCC 13: 191886667 (SE +/- 67659.28, N = 3).
Liquid-DSP 1.6, Threads: 72 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): Clang 17: 231370000 (SE +/- 91651.51, N = 3); GCC 13: 215500000 (SE +/- 79372.54, N = 3).
1. (CC) gcc options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -pthread -lm -lc -lliquid
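Because Liquid-DSP is reported at 1, 32, 64 and 72 threads, the results can also be read as a rough thread-scaling measurement. The sketch below is an illustration only: the helper `scaling_efficiency` is a hypothetical name, and the calculation (throughput ratio divided by thread count) is a common convention rather than anything the Phoronix Test Suite reports. The inputs are the 1-thread and 72-thread results for filter length 32 from this page.

```python
def scaling_efficiency(single_thread: float, multi_thread: float, threads: int):
    """Return (speedup, efficiency) for a throughput-style result.

    speedup    = multi-threaded throughput / single-threaded throughput
    efficiency = speedup / thread count (1.0 would be perfect scaling)
    """
    speedup = multi_thread / single_thread
    return speedup, speedup / threads


# samples/s at 1 thread and at 72 threads (Buffer Length 256, Filter Length 32)
results = {
    "GCC 13": (45_523_000, 2_952_333_333),
    "Clang 17": (68_921_000, 4_386_000_000),
}

for compiler, (one_thread, all_threads) in results.items():
    speedup, efficiency = scaling_efficiency(one_thread, all_threads, 72)
    print(f"{compiler}: {speedup:.1f}x speedup, {efficiency:.0%} efficiency")
```

On these numbers both compilers scale to roughly 64x on 72 threads, so the Clang 17 lead in this test comes mainly from single-thread throughput rather than from better scaling.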
libavif avifenc 1.0, Encoder Speed: 2 (Seconds, Fewer Is Better): Clang 17: 79.37 (SE +/- 0.27, N = 3); GCC 13: 67.07 (SE +/- 0.03, N = 3).
libavif avifenc 1.0, Encoder Speed: 6, Lossless (Seconds, Fewer Is Better): Clang 17: 3.717 (SE +/- 0.008, N = 3); GCC 13: 3.754 (SE +/- 0.007, N = 3).
libavif avifenc 1.0, Encoder Speed: 10, Lossless (Seconds, Fewer Is Better): Clang 17: 2.871 (SE +/- 0.003, N = 3); GCC 13: 2.851 (SE +/- 0.006, N = 3).
1. (CXX) g++ options: -O3 -fPIC -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm
C-Ray This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image. Learn more via the OpenBenchmarking.org test page.
C-Ray 1.1, Total Time - 4K, 16 Rays Per Pixel (Seconds, Fewer Is Better): Clang 17: 6.749 (SE +/- 0.009, N = 3); GCC 13: 6.006 (SE +/- 0.003, N = 3). 1. (CC) gcc options: -lm -lpthread -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2
Primesieve Primesieve generates prime numbers using a highly optimized sieve of Eratosthenes implementation. Primesieve primarily benchmarks the CPU's L1/L2 cache performance. Learn more via the OpenBenchmarking.org test page.
Primesieve 8.0, Length: 1e12 (Seconds, Fewer Is Better): Clang 17: 2.929 (SE +/- 0.001, N = 3); GCC 13: 2.891 (SE +/- 0.002, N = 3).
Primesieve 8.0, Length: 1e13 (Seconds, Fewer Is Better): Clang 17: 35.60 (SE +/- 0.34, N = 3); GCC 13: 35.19 (SE +/- 0.42, N = 3).
1. (CXX) g++ options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2
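Each result carries a standard error (SE) and a run count (N), which makes it possible to judge whether a small gap between the compilers stands out from run-to-run noise. The following is a rough heuristic sketch, not a formal significance test and not part of the Phoronix Test Suite: it assumes the two measurements are independent and expresses the gap in units of the combined standard error. The function name `gap_in_se_units` is hypothetical; the inputs are the two Primesieve results from this page.

```python
import math


def gap_in_se_units(mean_a: float, se_a: float, mean_b: float, se_b: float) -> float:
    """Return |mean_a - mean_b| divided by the combined standard error.

    Assumption: independent measurements, so the standard errors combine
    as sqrt(se_a**2 + se_b**2). With N = 3 runs this is only indicative.
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


# Primesieve 1e13 (seconds): 35.60 +/- 0.34 versus 35.19 +/- 0.42
small_gap = gap_in_se_units(35.60, 0.34, 35.19, 0.42)
# Primesieve 1e12 (seconds): 2.929 +/- 0.001 versus 2.891 +/- 0.002
large_gap = gap_in_se_units(2.929, 0.001, 2.891, 0.002)

print(f"1e13 gap: {small_gap:.2f} combined standard errors")
print(f"1e12 gap: {large_gap:.2f} combined standard errors")
```

Under this heuristic the 1e13 gap is smaller than one combined standard error and is hard to distinguish from noise, while the 1e12 gap is many times larger than its combined standard error despite being similar in relative size.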
LAME MP3 Encoding LAME is an MP3 encoder licensed under the LGPL. This test measures the time required to encode a WAV file to MP3 format. Learn more via the OpenBenchmarking.org test page.
LAME MP3 Encoding 3.100, WAV To MP3 (Seconds, Fewer Is Better): Clang 17: 6.287 (SE +/- 0.001, N = 3); GCC 13: 5.474 (SE +/- 0.003, N = 3). -lncurses -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr 1. (CC) gcc options: -O3 -pipe -mtune=neoverse-v2 -mcpu=neoverse-v2 -lm
Opus Codec Encoding Opus is an open audio codec. Opus is a lossy audio compression format designed primarily for interactive real-time applications over the Internet. This test uses Opus-Tools and measures the time required to encode a WAV file to Opus five times. Learn more via the OpenBenchmarking.org test page.
Opus Codec Encoding 1.4, WAV To Opus Encode (Seconds, Fewer Is Better): Clang 17: 31.42 (SE +/- 0.01, N = 5); GCC 13: 33.04 (SE +/- 0.02, N = 5). 1. (CXX) g++ options: -O3 -mtune=neoverse-v2 -mcpu=neoverse-v2 -fvisibility=hidden -logg -lm
GCC 13: Testing initiated at 9 February 2024 15:23 by user x.
Clang 17: Testing initiated at 9 February 2024 19:05 by user x.