Intel Xeon 6900P - SNC vs. HEX Clustering Mode

Benchmarks by Michael Larabel for a future article.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2409257-NE-INTELGNRH28

Run Management

Result Identifier: HEX Mode; Date: September 24; Test Run Duration: 11 Hours, 58 Minutes
Result Identifier: SNC3 - Default; Date: September 24; Test Run Duration: 10 Hours, 59 Minutes
Average Test Run Duration: 11 Hours, 29 Minutes


Intel Xeon 6900P - SNC vs. HEX Clustering Mode Benchmarks

Processor: 2 x Intel Xeon 6980P @ 3.90GHz (256 Cores / 512 Threads)
Motherboard: Intel BIRCHSTREAM (BHSDCRB1.IPC.0035.D44.2408292336 BIOS)
Chipset: Intel Ice Lake IEH
Memory: 1520GB
Disk: 960GB SAMSUNG MZ1L2960HCJR-00A07
Graphics: ASPEED
Network: Intel I210 + 2 x Intel 10-Gigabit X540-AT2
OS: Ubuntu 24.04
Kernel: 6.8.0-45-generic (x86_64)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 1920x1200

System Logs
- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: intel_pstate performance (EPP: performance) - CPU Microcode: 0x10002f0
- OpenJDK Runtime Environment (build 21.0.4+7-Ubuntu-1ubuntu224.04)
- Python 3.12.3
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: BHI_DIS_S + srbds: Not affected + tsx_async_abort: Not affected

[Figure: HEX Mode vs. SNC3 - Default Comparison (Phoronix Test Suite). Bar chart of the per-test percentage lead of one mode over the other, with axis ticks running from Baseline to +55.6%.]
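The percentages in the comparison above appear to be the lead of the better run over the other one. The following is a minimal Python sketch of that calculation, not the Phoronix Test Suite's own code: the function name `relative_lead` and the larger-over-smaller formula are assumptions, checked only against the allmodconfig (47.2%) and Cassandra Writes (46%) figures on this page.

```python
def relative_lead(hex_value, snc3_value, higher_is_better):
    """Return (winner, percent lead of the winner over the other run).

    Assumption: the lead is (larger / smaller - 1) * 100, regardless of
    whether the metric is "More Is Better" or "Fewer Is Better".
    """
    # HEX wins when it has the larger value on a more-is-better metric,
    # or the smaller value on a fewer-is-better metric.
    hex_wins = (hex_value > snc3_value) == higher_is_better
    winner = "HEX Mode" if hex_wins else "SNC3 - Default"
    larger = max(hex_value, snc3_value)
    smaller = min(hex_value, snc3_value)
    return winner, (larger / smaller - 1.0) * 100.0


if __name__ == "__main__":
    # Values copied from the results on this page.
    samples = [
        ("Timed Linux Kernel Compilation: allmodconfig (Seconds)", 193.11, 131.15, False),
        ("Apache Cassandra: Writes (Op/s)", 147125, 100742, True),
    ]
    for name, hex_value, snc3_value, higher_is_better in samples:
        winner, lead = relative_lead(hex_value, snc3_value, higher_is_better)
        print(f"{name}: {winner} leads by {lead:.1f}%")
```

Passing the metric direction explicitly keeps the winner label correct for timed tests, where the smaller number is the better result.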

Intel Xeon 6900P - SNC vs. HEX Clustering Mode

Test | HEX Mode | SNC3 - Default
build-linux-kernel: allmodconfig | 193.114 | 131.149
cassandra: Writes | 147125 | 100742
askap: tConvolve MPI - Gridding | 92985.2 | 125342
petsc: Streams | 595124.2316 | 449771.2850
askap: tConvolve MPI - Degridding | 82064.2 | 107661
graph500: 26 | 673326000 | 871769000
tensorflow: CPU - 512 - ResNet-50 | 214.73 | 173.99
build-llvm: Ninja | 94.061 | 76.730
namd: ATPase with 327,506 Atoms | 4.49970 | 3.78755
build-linux-kernel: defconfig | 26.983 | 23.443
graph500: 26 | 964107000 | 1105950000
graph500: 26 | 1530330000 | 1748790000
easywave: e2Asean Grid + BengkuluSept2007 Source - 1200 | 59.596 | 52.965
openradioss: Bird Strike on Windshield | 172.45 | 154.56
pyhpc: CPU - Numpy - 4194304 - Equation of State | 1.713 | 1.546
amg: | 7708395000 | 8504201333
incompact3d: input.i3d 193 Cells Per Direction | 2.82190266 | 2.56438231
relion: Basic - CPU | 114.397 | 104.051
mt-dgemm: Sustained Floating-Point Rate | 3865.658800 | 3545.814232
build-llvm: Unix Makefiles | 214.439 | 197.129
incompact3d: input.i3d 129 Cells Per Direction | 0.903198322 | 0.831954996
graph500: 26 | 1954670000 | 2116240000
openradioss: Chrysler Neon 1M | 66.46 | 61.48
nwchem: C240 Buckyball | 1779.7 | 1660.6
hpcg: 160 160 160 - 60 | 159.835 | 170.002
hpcg: 144 144 144 - 60 | 161.916 | 171.205
dacapobench: BioJava Biological Data Framework | 5791 | 5503
daphne: OpenMP - NDT Mapping | 547.43 | 574.84
libxsmm: 64 | 5554.0 | 5826.5
svt-av1: Preset 8 - Bosphorus 4K | 66.992 | 63.967
incompact3d: X3D-benchmarking input.i3d | 72.1088104 | 68.8912379
hpcg: 104 104 104 - 60 | 168.784 | 176.009
libxsmm: 128 | 7862.2 | 8182.9
openradioss: Cell Phone Drop Test | 40.73 | 39.18
svt-av1: Preset 13 - Beauty 4K 10-bit | 13.410 | 12.979
daphne: OpenMP - Points2Image | 4243.69 | 4375.31
specfem3d: Layered Halfspace | 7.422688518 | 7.207569259
openradioss: INIVOL and Fluid Structure Interaction Drop Container | 93.13 | 90.63
blender: BMW27 - CPU-Only | 7.69 | 7.49
svt-av1: Preset 8 - Beauty 4K 10-bit | 8.191 | 7.991
gpaw: Carbon Nanotube | 88.866 | 86.877
specfem3d: Tomographic Model | 4.448644552 | 4.361234856
dacapobench: Apache Kafka | 5995 | 6098
svt-av1: Preset 5 - Beauty 4K 10-bit | 5.684 | 5.773
compress-7zip: Compression Rating | 911774 | 897725
specfem3d: Homogeneous Halfspace | 5.513972044 | 5.433174239
byte: System Call | 910572120.1 | 897496230.9
gromacs: MPI CPU - water_GMX50_bare | 32.137 | 32.592
lammps: 20k Atoms | 92.752 | 94.052
pyhpc: CPU - Numpy - 4194304 - Isoneutral Mixing | 1.931 | 1.907
compress-7zip: Decompression Rating | 1377979 | 1394947
blender: Pabellon Barcelona - CPU-Only | 21.54 | 21.28
pgbench: 100 - 1000 - Read Write - Average Latency | 73.226 | 74.043
pgbench: 100 - 1000 - Read Write | 13656 | 13506
svt-av1: Preset 5 - Bosphorus 4K | 31.193 | 30.856
easywave: e2Asean Grid + BengkuluSept2007 Source - 2400 | 140.438 | 141.905
specfem3d: Water-layered Halfspace | 9.321282291 | 9.243850981
byte: Dhrystone 2 | 18804694499.0 | 18658833934.3
blender: Barbershop - CPU-Only | 68.40 | 68.00
specfem3d: Mount St. Helens | 4.018974773 | 3.995710366
blender: Fishy Cat - CPU-Only | 10.49 | 10.43
svt-av1: Preset 3 - Bosphorus 4K | 9.106 | 9.155
blender: Junkshop - CPU-Only | 10.89 | 10.85
blender: Classroom - CPU-Only | 17.51 | 17.55
dacapobench: Jython | 3568 | 3562
byte: Whetstone Double | 3728535.6 | 3722311.4
openradioss: Bumper Beam | 110.68 | 110.55
pgbench: 100 - 1000 - Read Only - Average Latency | 1.645 | 1.843
pgbench: 100 - 1000 - Read Only | 638580 | 542652
daphne: OpenMP - Euclidean Cluster | 676.38 | 612.85
stockfish: Chess Benchmark | 566141415 | 587474317
svt-av1: Preset 13 - Bosphorus 4K | 195.989 | 193.486
dacapobench: H2 Database Engine | 17178 | 11238
dacapobench: Apache Xalan XSLT | 2500 | 2595
dacapobench: Apache Tomcat | 8933 | 9223
lammps: Rhodopsin Protein | 70.474 | 71.402
openradioss: Rubber O-Ring Seal Installation | 231.10 | 219.85
libxsmm: 32 | 3214.1 | 3507.4
libxsmm: 256 | 2706.0 | 4213.9
namd: STMV with 1,066,628 Atoms | 2.55596 | 1.97540
minibude: OpenMP - BM2 | 283.757 | 268.530
minibude: OpenMP - BM2 | 7093.931 | 6713.241
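One common way to condense a table like the one above into a single number is a geometric mean of per-test ratios. The sketch below is illustrative only: the helper names are hypothetical, and the four rows are a hand-picked subset of the results above, chosen to show metrics in both directions, so the output says nothing about the full result set. Each result is first turned into an "SNC3 speedup" ratio so that values above 1 always mean SNC3 was faster.

```python
from statistics import geometric_mean


def snc3_speedup(hex_value, snc3_value, higher_is_better):
    """Ratio > 1 means SNC3 - Default was faster than HEX Mode."""
    if higher_is_better:
        return snc3_value / hex_value
    # For time-based metrics a smaller value is better, so invert the ratio.
    return hex_value / snc3_value


# (test, HEX Mode, SNC3 - Default, higher_is_better): subset copied from the table.
SUBSET = [
    ("build-linux-kernel: allmodconfig", 193.114, 131.149, False),
    ("cassandra: Writes", 147125, 100742, True),
    ("petsc: Streams", 595124.2316, 449771.2850, True),
    ("libxsmm: 256", 2706.0, 4213.9, True),
]


def overall_snc3_speedup(rows):
    """Geometric mean of the per-test SNC3 speedup ratios."""
    return geometric_mean(snc3_speedup(h, s, hib) for _, h, s, hib in rows)


if __name__ == "__main__":
    print(f"Geometric mean SNC3 speedup over {len(SUBSET)} tests: "
          f"{overall_snc3_speedup(SUBSET):.3f}")
```

A geometric mean is used rather than an arithmetic one so that a 2x win and a 2x loss cancel out instead of averaging to a net gain.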

Timed Linux Kernel Compilation

This test times how long it takes to build the Linux kernel in a default configuration (defconfig) for the architecture being tested or alternatively an allmodconfig for building all possible kernel modules for the build. Learn more via the OpenBenchmarking.org test page.

Timed Linux Kernel Compilation 6.8, Build: allmodconfig (Seconds, Fewer Is Better)
HEX Mode: 193.11 (SE +/- 0.27, N = 3)
SNC3 - Default: 131.15 (SE +/- 1.64, N = 3)
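The "SE +/-" and "N" figures reported with each result can be used for a rough check of whether a gap between the two modes is larger than run-to-run noise. The sketch below assumes that SE is the standard error of the mean over N runs; the function names are hypothetical and this is not part of the Phoronix Test Suite. It recovers the implied sample standard deviation and expresses the gap between the two means in units of their combined standard error.

```python
import math


def sample_std_from_se(se, n):
    """Implied sample standard deviation, assuming SE = std / sqrt(N)."""
    return se * math.sqrt(n)


def gap_in_combined_se(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two means divided by their combined standard error.

    Assumes the two runs are independent, so the variances of the means add.
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


if __name__ == "__main__":
    # Timed Linux Kernel Compilation, allmodconfig: values from this page.
    gap = gap_in_combined_se(193.11, 0.27, 131.15, 1.64)
    print(f"allmodconfig gap: {gap:.1f} combined standard errors")
    print(f"implied std for SE 0.27, N = 3: {sample_std_from_se(0.27, 3):.3f} s")
```

A gap of only one or two combined standard errors would suggest the two modes are hard to tell apart on that test. With N as small as 3 this is a coarse indicator, not a formal significance test.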

Apache Cassandra

Apache Cassandra 5.0, Test: Writes (Op/s, More Is Better)
HEX Mode: 147125 (SE +/- 1522.45, N = 3)
SNC3 - Default: 100742 (SE +/- 1022.25, N = 3)

ASKAP

ASKAP is a set of benchmarks from the Australian SKA Pathfinder. The principal ASKAP benchmarks are the Hogbom Clean Benchmark (tHogbomClean) and the Convolutional Resampling Benchmark (tConvolve); some earlier ASKAP benchmarks are also included for OpenCL and CUDA execution of tConvolve. Learn more via the OpenBenchmarking.org test page.

ASKAP 1.0, Test: tConvolve MPI - Gridding (Mpix/sec, More Is Better)
HEX Mode: 92985.2 (SE +/- 1249.70, N = 3)
SNC3 - Default: 125342.0 (SE +/- 1080.26, N = 3)
(CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

PETSc

PETSc, the Portable, Extensible Toolkit for Scientific Computation, is for the scalable (parallel) solution of scientific applications modeled by partial differential equations. This test profile runs the PETSc "make streams" benchmark and records the throughput rate when all available cores are utilized for the MPI Streams build. Learn more via the OpenBenchmarking.org test page.

PETSc 3.19, Test: Streams (MB/s, More Is Better)
HEX Mode: 595124.23 (SE +/- 1960.36, N = 3)
SNC3 - Default: 449771.29 (SE +/- 5879.52, N = 4)
(CC) gcc options: -fPIC -O3 -O2 -lpthread -lm

ASKAP

ASKAP 1.0, Test: tConvolve MPI - Degridding (Mpix/sec, More Is Better)
HEX Mode: 82064.2 (SE +/- 704.03, N = 3)
SNC3 - Default: 107661.0 (SE +/- 797.06, N = 3)
(CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

Graph500

This is a benchmark of the reference implementation of Graph500, an HPC benchmark focused on data intensive loads and commonly tested on supercomputers for complex data problems. Graph500 primarily stresses the communication subsystem of the hardware under test. Learn more via the OpenBenchmarking.org test page.

Graph500 3.0, Scale: 26 (sssp median_TEPS, More Is Better)
HEX Mode: 673326000
SNC3 - Default: 871769000
(CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.16.1, Device: CPU - Batch Size: 512 - Model: ResNet-50 (images/sec, More Is Better)
HEX Mode: 214.73 (SE +/- 2.28, N = 4)
SNC3 - Default: 173.99 (SE +/- 1.54, N = 3)

Timed LLVM Compilation

This test times how long it takes to compile/build the LLVM compiler stack. Learn more via the OpenBenchmarking.org test page.

Timed LLVM Compilation 16.0, Build System: Ninja (Seconds, Fewer Is Better)
HEX Mode: 94.06 (SE +/- 0.68, N = 15)
SNC3 - Default: 76.73 (SE +/- 0.86, N = 5)

NAMD

NAMD 3.0, Input: ATPase with 327,506 Atoms (ns/day, More Is Better)
HEX Mode: 4.49970 (SE +/- 0.05577, N = 15)
SNC3 - Default: 3.78755 (SE +/- 0.01928, N = 3)

Timed Linux Kernel Compilation

Timed Linux Kernel Compilation 6.8, Build: defconfig (Seconds, Fewer Is Better)
HEX Mode: 26.98 (SE +/- 0.23, N = 8)
SNC3 - Default: 23.44 (SE +/- 0.20, N = 8)

Graph500

Graph500 3.0, Scale: 26 (sssp max_TEPS, More Is Better)
HEX Mode: 964107000
SNC3 - Default: 1105950000
(CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

Graph500 3.0, Scale: 26 (bfs median_TEPS, More Is Better)
HEX Mode: 1530330000
SNC3 - Default: 1748790000
(CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

easyWave

The easyWave software allows simulating tsunami generation and propagation in the context of early warning systems. EasyWave supports making use of OpenMP for CPU multi-threading and there are also GPU ports available but not currently incorporated as part of this test profile. The easyWave tsunami generation software is run with one of the example/reference input files for measuring the CPU execution time. Learn more via the OpenBenchmarking.org test page.

easyWave r34, Input: e2Asean Grid + BengkuluSept2007 Source - Time: 1200 (Seconds, Fewer Is Better)
HEX Mode: 59.60 (SE +/- 0.84, N = 12)
SNC3 - Default: 52.97 (SE +/- 0.59, N = 5)
(CXX) g++ options: -O3 -fopenmp

OpenRadioss

OpenRadioss is an open-source AGPL-licensed finite element solver for dynamic event analysis. OpenRadioss is based on Altair Radioss and was open-sourced in 2022. This open-source finite element solver is benchmarked with various example models available from https://www.openradioss.org/models/ and https://github.com/OpenRadioss/ModelExchange/tree/main/Examples. This test is currently using a reference OpenRadioss binary build offered via GitHub. Learn more via the OpenBenchmarking.org test page.

OpenRadioss 2023.09.15, Model: Bird Strike on Windshield (Seconds, Fewer Is Better)
HEX Mode: 172.45 (SE +/- 0.18, N = 3)
SNC3 - Default: 154.56 (SE +/- 0.38, N = 3)

PyHPC Benchmarks

PyHPC-Benchmarks is a suite of Python high performance computing benchmarks for execution on CPUs and GPUs using various popular Python HPC libraries. The PyHPC CPU-based benchmarks focus on sequential CPU performance. Learn more via the OpenBenchmarking.org test page.

PyHPC Benchmarks 3.0, Device: CPU - Backend: Numpy - Project Size: 4194304 - Benchmark: Equation of State (Seconds, Fewer Is Better)
HEX Mode: 1.713 (SE +/- 0.012, N = 3)
SNC3 - Default: 1.546 (SE +/- 0.008, N = 3)

Algebraic Multi-Grid Benchmark

AMG is a parallel algebraic multigrid solver for linear systems arising from problems on unstructured grids. The driver provided with AMG builds linear systems for various 3-dimensional problems. Learn more via the OpenBenchmarking.org test page.

Algebraic Multi-Grid Benchmark 1.2 (Figure Of Merit, More Is Better)
HEX Mode: 7708395000 (SE +/- 20523024.36, N = 3)
SNC3 - Default: 8504201333 (SE +/- 11711915.09, N = 3)
(CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -lmpi

Xcompact3d Incompact3d

Xcompact3d Incompact3d is a Fortran-MPI based, finite difference high-performance code for solving the incompressible Navier-Stokes equation and as many scalar transport equations as needed. Learn more via the OpenBenchmarking.org test page.

Xcompact3d Incompact3d 2021-03-11, Input: input.i3d 193 Cells Per Direction (Seconds, Fewer Is Better)
HEX Mode: 2.82190266 (SE +/- 0.02990610, N = 15)
SNC3 - Default: 2.56438231 (SE +/- 0.01327750, N = 3)
(F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

RELION

RELION - REgularised LIkelihood OptimisatioN - is a stand-alone computer program for Maximum A Posteriori refinement of (multiple) 3D reconstructions or 2D class averages in cryo-electron microscopy (cryo-EM). It is developed in the research group of Sjors Scheres at the MRC Laboratory of Molecular Biology. Learn more via the OpenBenchmarking.org test page.

RELION 4.0.1, Test: Basic - Device: CPU (Seconds, Fewer Is Better)
HEX Mode: 114.40 (SE +/- 0.83, N = 3)
SNC3 - Default: 104.05 (SE +/- 0.54, N = 3)
(CXX) g++ options: -fopenmp -std=c++11 -O3 -rdynamic -ldl -ltiff -lfftw3f -lfftw3 -lpng -ljpeg -lmpi_cxx -lmpi

ACES DGEMM

ACES DGEMM 1.0, Sustained Floating-Point Rate (GFLOP/s, More Is Better)
HEX Mode: 3865.66 (SE +/- 5.73, N = 3)
SNC3 - Default: 3545.81 (SE +/- 6.73, N = 3)
(CC) gcc options: -ffast-math -mavx2 -O3 -fopenmp -lopenblas

Timed LLVM Compilation

Timed LLVM Compilation 16.0, Build System: Unix Makefiles (Seconds, Fewer Is Better)
HEX Mode: 214.44 (SE +/- 1.09, N = 3)
SNC3 - Default: 197.13 (SE +/- 0.78, N = 3)

Xcompact3d Incompact3d

Xcompact3d Incompact3d 2021-03-11, Input: input.i3d 129 Cells Per Direction (Seconds, Fewer Is Better)
HEX Mode: 0.903198322 (SE +/- 0.011942737, N = 3)
SNC3 - Default: 0.831954996 (SE +/- 0.005913486, N = 15)
(F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Graph500

Graph500 3.0, Scale: 26 (bfs max_TEPS, More Is Better)
HEX Mode: 1954670000
SNC3 - Default: 2116240000
(CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

OpenRadioss

OpenRadioss 2023.09.15, Model: Chrysler Neon 1M (Seconds, Fewer Is Better)
HEX Mode: 66.46 (SE +/- 0.38, N = 3)
SNC3 - Default: 61.48 (SE +/- 0.37, N = 3)

NWChem

NWChem is an open-source high performance computational chemistry package. Per NWChem's documentation, "NWChem aims to provide its users with computational chemistry tools that are scalable both in their ability to treat large scientific computational chemistry problems efficiently, and in their use of available parallel computing resources from high-performance parallel supercomputers to conventional workstation clusters." Learn more via the OpenBenchmarking.org test page.

NWChem 7.0.2, Input: C240 Buckyball (Seconds, Fewer Is Better)
HEX Mode: 1779.7
SNC3 - Default: 1660.6
(F9X) gfortran options: -lnwctask -lccsd -lmcscf -lselci -lmp2 -lmoints -lstepper -ldriver -loptim -lnwdft -lgradients -lcphf -lesp -lddscf -ldangchang -lguess -lhessian -lvib -lnwcutil -lrimp2 -lproperty -lsolvation -lnwints -lprepar -lnwmd -lnwpw -lofpw -lpaw -lpspw -lband -lnwpwlib -lcafe -lspace -lanalyze -lqhop -lpfft -ldplot -ldrdy -lvscf -lqmmm -lqmd -letrans -ltce -lbq -lmm -lcons -lperfm -ldntmc -lccca -ldimqm -lga -larmci -lpeigs -l64to32 -lopenblas -lpthread -lrt -llapack -lnwcblas -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz -lcomex -m64 -ffast-math -std=legacy -fdefault-integer-8 -finline-functions -O2

High Performance Conjugate Gradient

HPCG is the High Performance Conjugate Gradient, a newer scientific benchmark from Sandia National Labs aimed at supercomputer testing with more modern real-world workloads than HPCC. Learn more via the OpenBenchmarking.org test page.

High Performance Conjugate Gradient 3.1, X Y Z: 160 160 160 - RT: 60 (GFLOP/s, More Is Better)
HEX Mode: 159.84 (SE +/- 0.29, N = 3)
SNC3 - Default: 170.00 (SE +/- 0.02, N = 3)
(CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

High Performance Conjugate Gradient 3.1, X Y Z: 144 144 144 - RT: 60 (GFLOP/s, More Is Better)
HEX Mode: 161.92 (SE +/- 0.22, N = 3)
SNC3 - Default: 171.21 (SE +/- 0.29, N = 3)
(CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

DaCapo Benchmark

This test runs the DaCapo Benchmarks written in Java and intended to test system/CPU performance of various popular real-world Java workloads. Learn more via the OpenBenchmarking.org test page.

DaCapo Benchmark 23.11, Java Test: BioJava Biological Data Framework (msec, Fewer Is Better)
HEX Mode: 5791 (SE +/- 50.95, N = 8)
SNC3 - Default: 5503 (SE +/- 42.20, N = 15)

Darmstadt Automotive Parallel Heterogeneous Suite

DAPHNE is the Darmstadt Automotive Parallel HeterogeNEous Benchmark Suite, with OpenCL / CUDA / OpenMP test cases for evaluating programming models in the context of autonomous vehicle driving capabilities. Learn more via the OpenBenchmarking.org test page.

Darmstadt Automotive Parallel Heterogeneous Suite 2021.11.02, Backend: OpenMP - Kernel: NDT Mapping (Test Cases Per Minute, More Is Better)
HEX Mode: 547.43 (SE +/- 1.66, N = 3)
SNC3 - Default: 574.84 (SE +/- 8.56, N = 12)
(CXX) g++ options: -O3 -std=c++11 -fopenmp

libxsmm

Libxsmm is an open-source library for specialized dense and sparse matrix operations and deep learning primitives. Libxsmm supports making use of Intel AMX, AVX-512, and other modern CPU instruction set capabilities. Learn more via the OpenBenchmarking.org test page.

libxsmm 2-1.17-3645, M N K: 64 (GFLOPS/s, More Is Better)
HEX Mode: 5554.0 (SE +/- 68.57, N = 15)
SNC3 - Default: 5826.5 (SE +/- 63.47, N = 15)
(CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

SVT-AV1

SVT-AV1 2.2, Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
HEX Mode: 66.99 (SE +/- 0.74, N = 3)
SNC3 - Default: 63.97 (SE +/- 0.65, N = 3)
(CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

Xcompact3d Incompact3d

Xcompact3d Incompact3d 2021-03-11, Input: X3D-benchmarking input.i3d (Seconds, Fewer Is Better)
HEX Mode: 72.11 (SE +/- 0.09, N = 3)
SNC3 - Default: 68.89 (SE +/- 0.42, N = 3)
(F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

High Performance Conjugate Gradient

High Performance Conjugate Gradient 3.1, X Y Z: 104 104 104 - RT: 60 (GFLOP/s, More Is Better)
HEX Mode: 168.78 (SE +/- 0.15, N = 3)
SNC3 - Default: 176.01 (SE +/- 0.92, N = 3)
(CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

libxsmm

libxsmm 2-1.17-3645, M N K: 128 (GFLOPS/s, More Is Better)
HEX Mode: 7862.2 (SE +/- 151.59, N = 9)
SNC3 - Default: 8182.9 (SE +/- 103.51, N = 3)
(CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

OpenRadioss

OpenRadioss 2023.09.15, Model: Cell Phone Drop Test (Seconds, Fewer Is Better)
HEX Mode: 40.73 (SE +/- 0.29, N = 3)
SNC3 - Default: 39.18 (SE +/- 0.06, N = 3)

SVT-AV1

SVT-AV1 2.2, Encoder Mode: Preset 13 - Input: Beauty 4K 10-bit (Frames Per Second, More Is Better)
HEX Mode: 13.41 (SE +/- 0.00, N = 3)
SNC3 - Default: 12.98 (SE +/- 0.02, N = 3)
(CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

Darmstadt Automotive Parallel Heterogeneous Suite

Darmstadt Automotive Parallel Heterogeneous Suite 2021.11.02, Backend: OpenMP - Kernel: Points2Image (Test Cases Per Minute, More Is Better)
HEX Mode: 4243.69 (SE +/- 47.68, N = 15)
SNC3 - Default: 4375.31 (SE +/- 52.66, N = 3)
(CXX) g++ options: -O3 -std=c++11 -fopenmp

SPECFEM3D

SPECFEM3D simulates acoustic (fluid), elastic (solid), coupled acoustic/elastic, poroelastic or seismic wave propagation in any type of conforming mesh of hexahedra. This test profile currently relies on CPU-based execution for SPECFEM3D and uses a variety of its built-in examples/models for benchmarking. Learn more via the OpenBenchmarking.org test page.

SPECFEM3D 4.1.1, Model: Layered Halfspace (Seconds, Fewer Is Better)
HEX Mode: 7.422688518 (SE +/- 0.036865236, N = 3)
SNC3 - Default: 7.207569259 (SE +/- 0.060537089, N = 3)
(F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

OpenRadioss

OpenRadioss 2023.09.15, Model: INIVOL and Fluid Structure Interaction Drop Container (Seconds, Fewer Is Better)
HEX Mode: 93.13 (SE +/- 0.28, N = 3)
SNC3 - Default: 90.63 (SE +/- 0.25, N = 3)

Blender

Blender 4.2, Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better)
HEX Mode: 7.69 (SE +/- 0.05, N = 3)
SNC3 - Default: 7.49 (SE +/- 0.02, N = 3)

SVT-AV1

SVT-AV1 2.2, Encoder Mode: Preset 8 - Input: Beauty 4K 10-bit (Frames Per Second, More Is Better)
HEX Mode: 8.191 (SE +/- 0.009, N = 3)
SNC3 - Default: 7.991 (SE +/- 0.055, N = 3)
(CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

GPAW

GPAW is a density-functional theory (DFT) Python code based on the projector-augmented wave (PAW) method and the atomic simulation environment (ASE). Learn more via the OpenBenchmarking.org test page.

GPAW 23.6, Input: Carbon Nanotube (Seconds, Fewer Is Better)
HEX Mode: 88.87 (SE +/- 0.28, N = 3)
SNC3 - Default: 86.88 (SE +/- 0.72, N = 3)
(CC) gcc options: -shared -fwrapv -O2 -lxc -lblas -lmpi

SPECFEM3D

SPECFEM3D 4.1.1, Model: Tomographic Model (Seconds, Fewer Is Better)
HEX Mode: 4.448644552 (SE +/- 0.014802330, N = 3)
SNC3 - Default: 4.361234856 (SE +/- 0.014525946, N = 3)
(F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

DaCapo Benchmark

This test runs the DaCapo Benchmarks written in Java and intended to test system/CPU performance of various popular real-world Java workloads. Learn more via the OpenBenchmarking.org test page.

DaCapo Benchmark 23.11 - Java Test: Apache Kafka (msec, Fewer Is Better)
HEX Mode: 5995 (SE +/- 1.20, N = 3)
SNC3 - Default: 6098 (SE +/- 2.60, N = 3)

SVT-AV1

SVT-AV1 2.2 - Encoder Mode: Preset 5 - Input: Beauty 4K 10-bit (Frames Per Second, More Is Better)
HEX Mode: 5.684 (SE +/- 0.018, N = 3)
SNC3 - Default: 5.773 (SE +/- 0.006, N = 3)
(CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

7-Zip Compression

7-Zip Compression 24.05 - Test: Compression Rating (MIPS, More Is Better)
HEX Mode: 911774 (SE +/- 5282.93, N = 3)
SNC3 - Default: 897725 (SE +/- 8001.79, N = 3)
(CXX) g++ options: -lpthread -ldl -O2 -fPIC

SPECFEM3D

SPECFEM3D simulates acoustic (fluid), elastic (solid), coupled acoustic/elastic, poroelastic or seismic wave propagation in any type of conforming mesh of hexahedra. This test profile currently relies on CPU-based execution for SPECFEM3D and uses a variety of its built-in examples/models for benchmarking. Learn more via the OpenBenchmarking.org test page.

SPECFEM3D 4.1.1 - Model: Homogeneous Halfspace (Seconds, Fewer Is Better)
HEX Mode: 5.513972044 (SE +/- 0.007527831, N = 3)
SNC3 - Default: 5.433174239 (SE +/- 0.012290725, N = 3)
(F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

BYTE Unix Benchmark

BYTE Unix Benchmark 5.1.3-git - Computational Test: System Call (LPS, More Is Better)
HEX Mode: 910572120.1 (SE +/- 144616.99, N = 3)
SNC3 - Default: 897496230.9 (SE +/- 219845.26, N = 3)
(CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm

GROMACS

This test runs the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

GROMACS 2024 - Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better)
HEX Mode: 32.14 (SE +/- 0.05, N = 3)
SNC3 - Default: 32.59 (SE +/- 0.10, N = 3)
(CXX) g++ options: -O3 -lm

LAMMPS Molecular Dynamics Simulator

LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. Learn more via the OpenBenchmarking.org test page.

LAMMPS Molecular Dynamics Simulator 23Jun2022 - Model: 20k Atoms (ns/day, More Is Better)
HEX Mode: 92.75 (SE +/- 0.33, N = 3)
SNC3 - Default: 94.05 (SE +/- 0.21, N = 3)
(CXX) g++ options: -O3 -lm -ldl

PyHPC Benchmarks

PyHPC-Benchmarks is a suite of Python high performance computing benchmarks for execution on CPUs and GPUs using various popular Python HPC libraries. The PyHPC CPU-based benchmarks focus on sequential CPU performance. Learn more via the OpenBenchmarking.org test page.

PyHPC Benchmarks 3.0 - Device: CPU - Backend: Numpy - Project Size: 4194304 - Benchmark: Isoneutral Mixing (Seconds, Fewer Is Better)
HEX Mode: 1.931 (SE +/- 0.011, N = 3)
SNC3 - Default: 1.907 (SE +/- 0.020, N = 3)

7-Zip Compression

7-Zip Compression 24.05 - Test: Decompression Rating (MIPS, More Is Better)
HEX Mode: 1377979 (SE +/- 5512.25, N = 3)
SNC3 - Default: 1394947 (SE +/- 10392.40, N = 3)
(CXX) g++ options: -lpthread -ldl -O2 -fPIC

Blender

Blender 4.2 - Blend File: Pabellon Barcelona - Compute: CPU-Only (Seconds, Fewer Is Better)
HEX Mode: 21.54 (SE +/- 0.03, N = 3)
SNC3 - Default: 21.28 (SE +/- 0.05, N = 3)

PostgreSQL

This is a benchmark of PostgreSQL using the integrated pgbench for facilitating the database benchmarks. Learn more via the OpenBenchmarking.org test page.

PostgreSQL 16 - Scaling Factor: 100 - Clients: 1000 - Mode: Read Write - Average Latency (ms, Fewer Is Better)
HEX Mode: 73.23 (SE +/- 0.01, N = 3)
SNC3 - Default: 74.04 (SE +/- 0.20, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

PostgreSQL 16 - Scaling Factor: 100 - Clients: 1000 - Mode: Read Write (TPS, More Is Better)
HEX Mode: 13656 (SE +/- 2.00, N = 3)
SNC3 - Default: 13506 (SE +/- 37.41, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
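The Read Write latency and TPS figures are two views of the same measurement. As a minimal sketch, assuming a closed-loop load in which each of the 1000 clients issues its next transaction as soon as the previous one completes, the average latency is roughly the client count divided by the throughput. The function name is hypothetical, and the values are read in the order the identifiers are listed (HEX Mode first, SNC3 - Default second).

```python
def implied_latency_ms(clients: int, tps: float) -> float:
    """Average latency implied by clients / throughput, converted to milliseconds."""
    return clients / tps * 1000.0


CLIENTS = 1000

# (reported TPS, reported average latency in ms) for the Read Write test.
# Assumption: first value in each chart is HEX Mode, second is SNC3 - Default.
reported = {
    "HEX Mode": (13656, 73.23),
    "SNC3 - Default": (13506, 74.04),
}

for name, (tps, latency_ms) in reported.items():
    implied = implied_latency_ms(CLIENTS, tps)
    print(f"{name}: implied {implied:.2f} ms, reported {latency_ms:.2f} ms")
```

For these two runs the implied and reported latencies agree to two decimal places. The published numbers are averages over several runs, so the match does not have to be exact for every test.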

SVT-AV1

SVT-AV1 2.2 - Encoder Mode: Preset 5 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
HEX Mode: 31.19 (SE +/- 0.19, N = 3)
SNC3 - Default: 30.86 (SE +/- 0.35, N = 3)
(CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

easyWave

The easyWave software allows simulating tsunami generation and propagation in the context of early warning systems. EasyWave supports making use of OpenMP for CPU multi-threading and there are also GPU ports available but not currently incorporated as part of this test profile. The easyWave tsunami generation software is run with one of the example/reference input files for measuring the CPU execution time. Learn more via the OpenBenchmarking.org test page.

easyWave r34 - Input: e2Asean Grid + BengkuluSept2007 Source - Time: 2400 (Seconds, Fewer Is Better)
HEX Mode: 140.44 (SE +/- 1.63, N = 12)
SNC3 - Default: 141.91 (SE +/- 1.82, N = 12)
(CXX) g++ options: -O3 -fopenmp

SPECFEM3D

SPECFEM3D simulates acoustic (fluid), elastic (solid), coupled acoustic/elastic, poroelastic or seismic wave propagation in any type of conforming mesh of hexahedra. This test profile currently relies on CPU-based execution for SPECFEM3D and uses a variety of its built-in examples/models for benchmarking. Learn more via the OpenBenchmarking.org test page.

SPECFEM3D 4.1.1 - Model: Water-layered Halfspace (Seconds, Fewer Is Better)
HEX Mode: 9.321282291 (SE +/- 0.016038732, N = 3)
SNC3 - Default: 9.243850981 (SE +/- 0.054952663, N = 3)
(F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

BYTE Unix Benchmark

BYTE Unix Benchmark 5.1.3-git - Computational Test: Dhrystone 2 (LPS, More Is Better)
HEX Mode: 18804694499.0 (SE +/- 9501831.17, N = 3)
SNC3 - Default: 18658833934.3 (SE +/- 22131931.33, N = 3)
(CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm

Blender

Blender 4.2 - Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better)
HEX Mode: 68.40 (SE +/- 0.18, N = 3)
SNC3 - Default: 68.00 (SE +/- 0.33, N = 3)

SPECFEM3D

SPECFEM3D simulates acoustic (fluid), elastic (solid), coupled acoustic/elastic, poroelastic or seismic wave propagation in any type of conforming mesh of hexahedra. This test profile currently relies on CPU-based execution for SPECFEM3D and uses a variety of its built-in examples/models for benchmarking. Learn more via the OpenBenchmarking.org test page.

SPECFEM3D 4.1.1 - Model: Mount St. Helens (Seconds, Fewer Is Better)
HEX Mode: 4.018974773 (SE +/- 0.026444752, N = 3)
SNC3 - Default: 3.995710366 (SE +/- 0.037097693, N = 7)
(F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Blender

Blender 4.2 - Blend File: Fishy Cat - Compute: CPU-Only (Seconds, Fewer Is Better)
HEX Mode: 10.49 (SE +/- 0.10, N = 3)
SNC3 - Default: 10.43 (SE +/- 0.12, N = 3)

SVT-AV1

SVT-AV1 2.2 - Encoder Mode: Preset 3 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
HEX Mode: 9.106 (SE +/- 0.031, N = 3)
SNC3 - Default: 9.155 (SE +/- 0.011, N = 3)
(CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

Blender

Blender 4.2 - Blend File: Junkshop - Compute: CPU-Only (Seconds, Fewer Is Better)
HEX Mode: 10.89 (SE +/- 0.06, N = 3)
SNC3 - Default: 10.85 (SE +/- 0.03, N = 3)

Blender 4.2 - Blend File: Classroom - Compute: CPU-Only (Seconds, Fewer Is Better)
HEX Mode: 17.51 (SE +/- 0.04, N = 3)
SNC3 - Default: 17.55 (SE +/- 0.14, N = 3)

DaCapo Benchmark

This test runs the DaCapo Benchmarks written in Java and intended to test system/CPU performance of various popular real-world Java workloads. Learn more via the OpenBenchmarking.org test page.

DaCapo Benchmark 23.11 - Java Test: Jython (msec, Fewer Is Better)
HEX Mode: 3568 (SE +/- 22.30, N = 3)
SNC3 - Default: 3562 (SE +/- 41.68, N = 3)

BYTE Unix Benchmark

BYTE Unix Benchmark 5.1.3-git - Computational Test: Whetstone Double (MWIPS, More Is Better)
HEX Mode: 3728535.6 (SE +/- 113.17, N = 3)
SNC3 - Default: 3722311.4 (SE +/- 382.28, N = 3)
(CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm

OpenRadioss

OpenRadioss is an open-source AGPL-licensed finite element solver for dynamic event analysis. OpenRadioss is based on Altair Radioss and was open-sourced in 2022. This open-source finite element solver is benchmarked with various example models available from https://www.openradioss.org/models/ and https://github.com/OpenRadioss/ModelExchange/tree/main/Examples. This test is currently using a reference OpenRadioss binary build offered via GitHub. Learn more via the OpenBenchmarking.org test page.

OpenRadioss 2023.09.15 - Model: Bumper Beam (Seconds, Fewer Is Better)
HEX Mode: 110.68 (SE +/- 0.36, N = 3)
SNC3 - Default: 110.55 (SE +/- 0.43, N = 3)

PostgreSQL

This is a benchmark of PostgreSQL using the integrated pgbench for facilitating the database benchmarks. Learn more via the OpenBenchmarking.org test page.

PostgreSQL 16 - Scaling Factor: 100 - Clients: 1000 - Mode: Read Only - Average Latency (ms, Fewer Is Better)
HEX Mode: 1.645 (SE +/- 0.108, N = 12)
SNC3 - Default: 1.843 (SE +/- 0.023, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

PostgreSQL 16 - Scaling Factor: 100 - Clients: 1000 - Mode: Read Only (TPS, More Is Better)
HEX Mode: 638580 (SE +/- 42784.99, N = 12)
SNC3 - Default: 542652 (SE +/- 6739.41, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

Darmstadt Automotive Parallel Heterogeneous Suite

DAPHNE is the Darmstadt Automotive Parallel HeterogeNEous Benchmark Suite. It provides OpenCL, CUDA, and OpenMP test cases based on automotive workloads for evaluating programming models in the context of autonomous driving capabilities. Learn more via the OpenBenchmarking.org test page.

Darmstadt Automotive Parallel Heterogeneous Suite 2021.11.02 - Backend: OpenMP - Kernel: Euclidean Cluster (Test Cases Per Minute, More Is Better)
HEX Mode: 676.38 (SE +/- 5.87, N = 3)
SNC3 - Default: 612.85 (SE +/- 15.97, N = 12)
(CXX) g++ options: -O3 -std=c++11 -fopenmp

Stockfish

Stockfish 17 - Chess Benchmark (Nodes Per Second, More Is Better)
HEX Mode: 566141415 (SE +/- 10082654.90, N = 9)
SNC3 - Default: 587474317 (SE +/- 16310733.41, N = 6)
(CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -funroll-loops -msse -msse3 -mpopcnt -mavx2 -mbmi -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto-partition=one -flto=jobserver

SVT-AV1

SVT-AV1 2.2 - Encoder Mode: Preset 13 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
HEX Mode: 195.99 (SE +/- 4.03, N = 12)
SNC3 - Default: 193.49 (SE +/- 0.39, N = 3)
(CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

DaCapo Benchmark

This test runs the DaCapo Benchmarks written in Java and intended to test system/CPU performance of various popular real-world Java workloads. Learn more via the OpenBenchmarking.org test page.

DaCapo Benchmark 23.11 - Java Test: H2 Database Engine (msec, Fewer Is Better)
HEX Mode: 17178 (SE +/- 372.20, N = 15)
SNC3 - Default: 11238 (SE +/- 336.25, N = 15)

DaCapo Benchmark 23.11 - Java Test: Apache Xalan XSLT (msec, Fewer Is Better)
HEX Mode: 2500 (SE +/- 40.23, N = 15)
SNC3 - Default: 2595 (SE +/- 50.69, N = 15)

DaCapo Benchmark 23.11 - Java Test: Apache Tomcat (msec, Fewer Is Better)
HEX Mode: 8933 (SE +/- 144.35, N = 15)
SNC3 - Default: 9223 (SE +/- 74.15, N = 15)

LAMMPS Molecular Dynamics Simulator

LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. Learn more via the OpenBenchmarking.org test page.

LAMMPS Molecular Dynamics Simulator 23Jun2022 - Model: Rhodopsin Protein (ns/day, More Is Better)
HEX Mode: 70.47 (SE +/- 0.88, N = 3)
SNC3 - Default: 71.40 (SE +/- 1.30, N = 12)
(CXX) g++ options: -O3 -lm -ldl

OpenRadioss

OpenRadioss is an open-source AGPL-licensed finite element solver for dynamic event analysis. OpenRadioss is based on Altair Radioss and was open-sourced in 2022. This open-source finite element solver is benchmarked with various example models available from https://www.openradioss.org/models/ and https://github.com/OpenRadioss/ModelExchange/tree/main/Examples. This test is currently using a reference OpenRadioss binary build offered via GitHub. Learn more via the OpenBenchmarking.org test page.

OpenRadioss 2023.09.15 - Model: Rubber O-Ring Seal Installation (Seconds, Fewer Is Better)
HEX Mode: 231.10 (SE +/- 6.59, N = 9)
SNC3 - Default: 219.85 (SE +/- 3.87, N = 12)

libxsmm

Libxsmm is an open-source library for specialized dense and sparse matrix operations and deep learning primitives. Libxsmm supports making use of Intel AMX, AVX-512, and other modern CPU instruction set capabilities. Learn more via the OpenBenchmarking.org test page.

libxsmm 2-1.17-3645 - M N K: 32 (GFLOPS/s, More Is Better)
HEX Mode: 3214.1 (SE +/- 68.51, N = 12)
SNC3 - Default: 3507.4 (SE +/- 58.49, N = 12)
(CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

libxsmm 2-1.17-3645 - M N K: 256 (GFLOPS/s, More Is Better)
HEX Mode: 2706.0 (SE +/- 36.38, N = 3)
SNC3 - Default: 4213.9 (SE +/- 124.41, N = 15)
(CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

NAMD

NAMD 3.0 - Input: STMV with 1,066,628 Atoms (ns/day, More Is Better)
HEX Mode: 2.55596 (SE +/- 0.06201, N = 15)
SNC3 - Default: 1.97540 (SE +/- 0.01818, N = 13)

miniBUDE

MiniBUDE is a mini application for the core computation of the Bristol University Docking Engine (BUDE). This test profile currently makes use of the OpenMP implementation of miniBUDE for CPU benchmarking. Learn more via the OpenBenchmarking.org test page.

miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM2 (Billion Interactions/s, More Is Better)
HEX Mode: 283.76 (SE +/- 2.64, N = 6)
SNC3 - Default: 268.53 (SE +/- 6.07, N = 12)
(CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM2 (GFInst/s, More Is Better)
HEX Mode: 7093.93 (SE +/- 66.01, N = 6)
SNC3 - Default: 6713.24 (SE +/- 151.76, N = 12)
(CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm