AMD EPYC 9684X 3D V-Cache

AMD EPYC 9684X 96-Core testing with an AMD Titanite_4G (RTI1007B BIOS) and ASPEED on Ubuntu 22.04 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2307207-NE-UPLOAD92587

Result Identifier: Default
Date Run: July 17 2023
Test Duration: 1 Day, 8 Hours, 22 Minutes


AMD EPYC 9684X 3D V-Cache - System Details

Processor: AMD EPYC 9684X 96-Core @ 2.55GHz (96 Cores / 192 Threads)
Motherboard: AMD Titanite_4G (RTI1007B BIOS)
Chipset: AMD Device 14a4
Memory: 768GB
Disk: 2 x 1920GB SAMSUNG MZWLJ1T9HBJR-00007
Graphics: ASPEED
Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 22.04
Kernel: 5.19.0-41-generic (x86_64)
Desktop: GNOME Shell 42.5
Display Server: X Server 1.21.1.4
Vulkan: 1.3.224
Compiler: GCC 11.3.0
File-System: ext4
Screen Resolution: 1024x768

System Logs

- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: acpi-cpufreq performance (Boost: Enabled)
- CPU Microcode: 0xa101121
- Python 3.10.6
- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

AMD EPYC 9684X 3D V-Cache: result overview table (one row per test, single "Default" column). The individual results are listed per test below.

WRF

WRF, the Weather Research and Forecasting Model, is a "next-generation mesoscale numerical weather prediction system designed for both atmospheric research and operational forecasting applications. It features two dynamical cores, a data assimilation system, and a software architecture supporting parallel computation and system extensibility." Learn more via the OpenBenchmarking.org test page.

WRF 4.2.2 - Input: conus 2.5km (Seconds, Fewer Is Better)
Default: 11269.26
1. (F9X) gfortran options: -O2 -ftree-vectorize -funroll-loops -ffree-form -fconvert=big-endian -frecord-marker=4 -fallow-invalid-boz -lesmf_time -lwrfio_nf -lnetcdff -lnetcdf -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

OpenFOAM

OpenFOAM is the leading free, open-source software for computational fluid dynamics (CFD). This test profile currently uses the drivaerFastback test case for analyzing automotive aerodynamics or alternatively the older motorBike input. Learn more via the OpenBenchmarking.org test page.

OpenFOAM 10 - Input: drivaerFastback, Large Mesh Size - Execution Time (Seconds, Fewer Is Better)
Default: 8994.47

OpenFOAM 10 - Input: drivaerFastback, Large Mesh Size - Mesh Time (Seconds, Fewer Is Better)
Default: 585.14
1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

PETSc

PETSc, the Portable, Extensible Toolkit for Scientific Computation, is for the scalable (parallel) solution of scientific applications modeled by partial differential equations. This test profile runs the PETSc "make streams" benchmark and records the throughput rate when all available cores are utilized for the MPI Streams build. Learn more via the OpenBenchmarking.org test page.

PETSc 3.19 - Test: Streams (MB/s, More Is Better)
Default: 272616.80 (SE +/- 6007.86, N = 9)
1. (CC) gcc options: -fPIC -O3 -O2 -lpthread -ludev -lpciaccess -lm
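The "SE +/-" and "N" figures attached to most results on this page describe run-to-run spread. As a minimal sketch, assuming SE denotes the usual standard error of the mean (sample standard deviation divided by the square root of N), it could be computed as below. Whether the Phoronix Test Suite uses exactly this estimator is not confirmed here, and the per-run values are hypothetical, not taken from the result file.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    return statistics.stdev(samples) / math.sqrt(n)


# Hypothetical per-run throughputs in MB/s (illustration only,
# NOT the actual PETSc runs behind the result above).
runs = [270000.0, 275000.0, 280000.0]

mean = statistics.fmean(runs)
se = standard_error(runs)
print(f"Default: {mean:.2f} (SE +/- {se:.2f}, N = {len(runs)})")
```

A small SE relative to the mean suggests the reported average is stable across runs; a large one suggests the individual runs disagreed noticeably.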

High Performance Conjugate Gradient

HPCG is the High Performance Conjugate Gradient, a newer scientific benchmark from Sandia National Labs focused on supercomputer testing with modern real-world workloads compared to HPCC. Learn more via the OpenBenchmarking.org test page.

High Performance Conjugate Gradient 3.1 - X Y Z: 144 144 144 - RT: 60 (GFLOP/s, More Is Better)
Default: 24.61 (SE +/- 0.44, N = 9)

High Performance Conjugate Gradient 3.1 - X Y Z: 192 192 192 - RT: 60 (GFLOP/s, More Is Better)
Default: 22.83 (SE +/- 0.18, N = 3)
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

Whisper.cpp

Whisper.cpp is a port of OpenAI's Whisper model in C/C++. Whisper.cpp is developed by Georgi Gerganov for transcribing WAV audio files to text / speech recognition. Whisper.cpp supports ARM NEON, x86 AVX, and other advanced CPU features. Learn more via the OpenBenchmarking.org test page.

Whisper.cpp 1.4 - Model: ggml-medium.en - Input: 2016 State of the Union (Seconds, Fewer Is Better)
Default: 1444.14 (SE +/- 5.78, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread

libxsmm

Libxsmm is an open-source library for specialized dense and sparse matrix operations and deep learning primitives. Libxsmm supports making use of Intel AMX, AVX-512, and other modern CPU instruction set capabilities. Learn more via the OpenBenchmarking.org test page.

libxsmm 2-1.17-3645 - M N K: 128 (GFLOPS/s, More Is Better)
Default: 2913.4 (SE +/- 27.97, N = 9)
1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

High Performance Conjugate Gradient

HPCG is the High Performance Conjugate Gradient, a newer scientific benchmark from Sandia National Labs focused on supercomputer testing with modern real-world workloads compared to HPCC. Learn more via the OpenBenchmarking.org test page.

High Performance Conjugate Gradient 3.1 - X Y Z: 160 160 160 - RT: 60 (GFLOP/s, More Is Better)
Default: 23.84 (SE +/- 0.34, N = 3)

High Performance Conjugate Gradient 3.1 - X Y Z: 104 104 104 - RT: 60 (GFLOP/s, More Is Better)
Default: 26.83 (SE +/- 0.82, N = 9)
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

Whisper.cpp

Whisper.cpp is a port of OpenAI's Whisper model in C/C++. Whisper.cpp is developed by Georgi Gerganov for transcribing WAV audio files to text / speech recognition. Whisper.cpp supports ARM NEON, x86 AVX, and other advanced CPU features. Learn more via the OpenBenchmarking.org test page.

Whisper.cpp 1.4 - Model: ggml-small.en - Input: 2016 State of the Union (Seconds, Fewer Is Better)
Default: 769.11 (SE +/- 3.33, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.12 - Device: CPU - Batch Size: 256 - Model: ResNet-50 (images/sec, More Is Better)
Default: 115.51 (SE +/- 1.00, N = 9)

Palabos

The Palabos library is a framework for general purpose Computational Fluid Dynamics (CFD). Palabos uses a kernel based on the Lattice Boltzmann method. This test profile uses the Palabos MPI-based Cavity3D benchmark. Learn more via the OpenBenchmarking.org test page.

Palabos 2.3 - Grid Size: 400 (Mega Site Updates Per Second, More Is Better)
Default: 317.82 (SE +/- 2.86, N = 12)
1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm

LeelaChessZero

LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.

LeelaChessZero 0.28 - Backend: BLAS (Nodes Per Second, More Is Better)
Default: 9760 (SE +/- 103.04, N = 5)
1. (CXX) g++ options: -flto -pthread

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.12 - Device: CPU - Batch Size: 512 - Model: GoogLeNet (images/sec, More Is Better)
Default: 409.02 (SE +/- 3.82, N = 12)

TensorFlow 2.12 - Device: CPU - Batch Size: 512 - Model: ResNet-50 (images/sec, More Is Better)
Default: 116.15 (SE +/- 0.93, N = 3)

LeelaChessZero

LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.

LeelaChessZero 0.28 - Backend: Eigen (Nodes Per Second, More Is Better)
Default: 11884 (SE +/- 72.53, N = 3)
1. (CXX) g++ options: -flto -pthread

Whisper.cpp

Whisper.cpp is a port of OpenAI's Whisper model in C/C++. Whisper.cpp is developed by Georgi Gerganov for transcribing WAV audio files to text / speech recognition. Whisper.cpp supports ARM NEON, x86 AVX, and other advanced CPU features. Learn more via the OpenBenchmarking.org test page.

Whisper.cpp 1.4 - Model: ggml-base.en - Input: 2016 State of the Union (Seconds, Fewer Is Better)
Default: 332.02 (SE +/- 1.67, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread

ASKAP

ASKAP is a set of benchmarks from the Australian SKA Pathfinder. The principal ASKAP benchmarks are the Hogbom Clean Benchmark (tHogbomClean) and the Convolutional Resampling Benchmark (tConvolve); some earlier ASKAP benchmarks are also included for OpenCL and CUDA execution of tConvolve. Learn more via the OpenBenchmarking.org test page.

ASKAP 1.0 - Test: tConvolve MT - Degridding (Million Grid Points Per Second, More Is Better)
Default: 15603.2 (SE +/- 16.78, N = 3)

ASKAP 1.0 - Test: tConvolve MT - Gridding (Million Grid Points Per Second, More Is Better)
Default: 13582.8 (SE +/- 1.20, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

libxsmm

Libxsmm is an open-source library for specialized dense and sparse matrix operations and deep learning primitives. Libxsmm supports making use of Intel AMX, AVX-512, and other modern CPU instruction set capabilities. Learn more via the OpenBenchmarking.org test page.

libxsmm 2-1.17-3645 - M N K: 256 (GFLOPS/s, More Is Better)
Default: 3064.4 (SE +/- 32.92, N = 5)
1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

LAMMPS Molecular Dynamics Simulator

LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. Learn more via the OpenBenchmarking.org test page.

LAMMPS Molecular Dynamics Simulator 23Jun2022 - Model: 20k Atoms (ns/day, More Is Better)
Default: 39.89 (SE +/- 0.07, N = 3)
1. (CXX) g++ options: -O3 -lm -ldl

Timed Linux Kernel Compilation

This test times how long it takes to build the Linux kernel in a default configuration (defconfig) for the architecture being tested or alternatively an allmodconfig for building all possible kernel modules for the build. Learn more via the OpenBenchmarking.org test page.

Timed Linux Kernel Compilation 6.1 - Build: allmodconfig (Seconds, Fewer Is Better)
Default: 202.56 (SE +/- 0.58, N = 3)

Monte Carlo Simulations of Ionised Nebulae

Mocassin is the Monte Carlo Simulations of Ionised Nebulae. MOCASSIN is a fully 3D or 2D photoionisation and dust radiative transfer code which employs a Monte Carlo approach to the transfer of radiation through media of arbitrary geometry and density distribution. Learn more via the OpenBenchmarking.org test page.

Monte Carlo Simulations of Ionised Nebulae 2.02.73.3 - Input: Dust 2D tau100.0 (Seconds, Fewer Is Better)
Default: 190.07 (SE +/- 1.70, N = 3)
1. (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O2 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lz

Timed LLVM Compilation

This test times how long it takes to compile/build the LLVM compiler stack. Learn more via the OpenBenchmarking.org test page.

Timed LLVM Compilation 16.0 - Build System: Unix Makefiles (Seconds, Fewer Is Better)
Default: 184.05 (SE +/- 1.07, N = 3)

Palabos

The Palabos library is a framework for general purpose Computational Fluid Dynamics (CFD). Palabos uses a kernel based on the Lattice Boltzmann method. This test profile uses the Palabos MPI-based Cavity3D benchmark. Learn more via the OpenBenchmarking.org test page.

Palabos 2.3 - Grid Size: 500 (Mega Site Updates Per Second, More Is Better)
Default: 328.71 (SE +/- 0.10, N = 3)

Palabos 2.3 - Grid Size: 1000 (Mega Site Updates Per Second, More Is Better)
Default: 370.19 (SE +/- 0.10, N = 3)
1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm

OSPRay

Intel OSPRay is a portable ray-tracing engine for high-performance, high-fidelity scientific visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OSPRay 2.12 - Benchmark: particle_volume/pathtracer/real_time (Items Per Second, More Is Better)
Default: 199.70 (SE +/- 0.09, N = 3)

Numpy Benchmark

This is a test to obtain the general Numpy performance. Learn more via the OpenBenchmarking.org test page.

Numpy Benchmark (Score, More Is Better)
Default: 586.50 (SE +/- 0.96, N = 3)

Stress-NG

Stress-NG is a Linux stress tool developed by Colin Ian King. Learn more via the OpenBenchmarking.org test page.

Stress-NG 0.15.10 - Test: Socket Activity (Bogo Ops/s, More Is Better)
Default: 2960.24 (SE +/- 1108.03, N = 15)

Stress-NG 0.15.10 - Test: Context Switching (Bogo Ops/s, More Is Better)
Default: 15480601.67 (SE +/- 644041.01, N = 15)

Stress-NG 0.15.10 - Test: IO_uring (Bogo Ops/s, More Is Better)
Default: 4790428.16 (SE +/- 576446.94, N = 12)
1. (CXX) g++ options: -O2 -std=gnu99 -lc
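Some of the Stress-NG results above carry a standard error that is large relative to the reported value. A rough way to spot such runs is to divide the SE by the mean; the sketch below does this for the three results just listed. The 5% cutoff is an arbitrary assumption for illustration, not a Phoronix Test Suite setting.

```python
# (reported value, standard error) pairs copied from the three
# Stress-NG results listed above.
results = {
    "Socket Activity": (2960.24, 1108.03),
    "Context Switching": (15480601.67, 644041.01),
    "IO_uring": (4790428.16, 576446.94),
}

# Hypothetical cutoff: flag results whose SE exceeds 5% of the mean.
THRESHOLD = 0.05


def relative_se(value, se):
    """Standard error expressed as a fraction of the reported value."""
    return se / value


noisy = {}
for name, (value, se) in results.items():
    rse = relative_se(value, se)
    if rse > THRESHOLD:
        noisy[name] = rse
    print(f"{name}: relative SE = {rse:.1%}")

print("Flagged as noisy:", sorted(noisy))
```

Results flagged this way are better read as an order of magnitude than as a precise figure.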

Blender

Blender 3.6 - Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better)
Default: 142.03 (SE +/- 0.01, N = 3)

OSPRay

Intel OSPRay is a portable ray-tracing engine for high-performance, high-fidelity scientific visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OSPRay 2.12 - Benchmark: particle_volume/scivis/real_time (Items Per Second, More Is Better)
Default: 25.11 (SE +/- 0.01, N = 3)

Timed Gem5 Compilation

This test times how long it takes to compile Gem5. Gem5 is a simulator for computer system architecture research. Gem5 is widely used for computer architecture research within the industry, academia, and more. Learn more via the OpenBenchmarking.org test page.

Timed Gem5 Compilation 21.2 - Time To Compile (Seconds, Fewer Is Better)
Default: 137.69 (SE +/- 0.72, N = 3)

OSPRay Studio

Intel OSPRay Studio is an open-source, interactive visualization and ray-tracing software package. OSPRay Studio makes use of Intel OSPRay, a portable ray-tracing engine for high-performance, high-fidelity visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OSPRay Studio 0.11 - Camera: 3 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer (ms, Fewer Is Better)
Default: 40392 (SE +/- 43.02, N = 3)
1. (CXX) g++ options: -O3 -lm -ldl

asmFish

This is a test of asmFish, an advanced chess benchmark written in Assembly. Learn more via the OpenBenchmarking.org test page.

asmFish 2018-07-23 - 1024 Hash Memory, 26 Depth (Nodes/second, More Is Better)
Default: 234040870 (SE +/- 1310608.25, N = 3)

Stress-NG

Stress-NG is a Linux stress tool developed by Colin Ian King. Learn more via the OpenBenchmarking.org test page.

Stress-NG 0.15.10 - Test: Cloning (Bogo Ops/s, More Is Better)
Default: 12478.74 (SE +/- 972.22, N = 12)

Stress-NG 0.15.10 - Test: SENDFILE (Bogo Ops/s, More Is Better)
Default: 1517063.84 (SE +/- 117132.40, N = 12)
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Ngspice

Ngspice is an open-source SPICE circuit simulator. Ngspice was originally based on the Berkeley SPICE electronic circuit simulator. Ngspice supports basic threading using OpenMP. This test profile is making use of the ISCAS 85 benchmark circuits. Learn more via the OpenBenchmarking.org test page.

Ngspice 34 - Circuit: C2670 (Seconds, Fewer Is Better)
Default: 118.39 (SE +/- 0.26, N = 3)
1. (CC) gcc options: -O0 -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE

OSPRay Studio

Intel OSPRay Studio is an open-source, interactive visualization and ray-tracing software package. OSPRay Studio makes use of Intel OSPRay, a portable ray-tracing engine for high-performance, high-fidelity visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OSPRay Studio 0.11 - Camera: 2 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer (ms, Fewer Is Better)
Default: 34341 (SE +/- 51.48, N = 3)

OSPRay Studio 0.11 - Camera: 1 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer (ms, Fewer Is Better)
Default: 34007 (SE +/- 22.67, N = 3)

OSPRay Studio 0.11 - Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer (ms, Fewer Is Better)
Default: 1261 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -O3 -lm -ldl

Timed LLVM Compilation

This test times how long it takes to compile/build the LLVM compiler stack. Learn more via the OpenBenchmarking.org test page.

Timed LLVM Compilation 16.0 - Build System: Ninja (Seconds, Fewer Is Better)
Default: 112.89 (SE +/- 0.31, N = 3)

OSPRay Studio

Intel OSPRay Studio is an open-source, interactive visualization and ray-tracing software package. OSPRay Studio makes use of Intel OSPRay, a portable ray-tracing engine for high-performance, high-fidelity visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OSPRay Studio 0.11 - Camera: 2 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer (ms, Fewer Is Better)
Default: 1065 (SE +/- 2.33, N = 3)

OSPRay Studio 0.11 - Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer (ms, Fewer Is Better)
Default: 1059 (SE +/- 1.15, N = 3)
1. (CXX) g++ options: -O3 -lm -ldl

Timed Node.js Compilation

This test profile times how long it takes to build/compile Node.js itself from source. Node.js is a JavaScript runtime built on the Chrome V8 JavaScript engine and is itself written in C/C++. Learn more via the OpenBenchmarking.org test page.

Timed Node.js Compilation 19.8.1 - Time To Compile (seconds, fewer is better): Default = 105.23, SE +/- 0.15, N = 3

Ngspice

Ngspice is an open-source SPICE circuit simulator, originally based on the Berkeley SPICE electronic circuit simulator. Ngspice supports basic threading using OpenMP. This test profile uses the ISCAS 85 benchmark circuits. Learn more via the OpenBenchmarking.org test page.

Ngspice 34 - Circuit: C7552 (seconds, fewer is better): Default = 100.85, SE +/- 0.08, N = 3

1. (CC) gcc options: -O0 -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE

OpenFOAM

OpenFOAM is the leading free, open-source software for computational fluid dynamics (CFD). This test profile currently uses the drivaerFastback test case for analyzing automotive aerodynamics or alternatively the older motorBike input. Learn more via the OpenBenchmarking.org test page.

OpenFOAM 10 - Input: drivaerFastback, Medium Mesh Size - Execution Time (seconds, fewer is better): Default = 181.02

OpenFOAM 10 - Input: drivaerFastback, Medium Mesh Size - Mesh Time (seconds, fewer is better): Default = 108.37

1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

OSPRay Studio

OSPRay Studio 0.11 - Camera: 2 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer (ms, fewer is better): Default = 17078, SE +/- 6.03, N = 3

1. (CXX) g++ options: -O3 -lm -ldl

Palabos

The Palabos library is a framework for general purpose Computational Fluid Dynamics (CFD). Palabos uses a kernel based on the Lattice Boltzmann method. This test profile uses the Palabos MPI-based Cavity3D benchmark. Learn more via the OpenBenchmarking.org test page.

Palabos 2.3 - Grid Size: 100 (Mega Site Updates Per Second, more is better): Default = 591.53, SE +/- 4.57, N = 3

1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm

OSPRay Studio

OSPRay Studio 0.11 - Camera: 1 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer (ms, fewer is better): Default = 16953, SE +/- 50.26, N = 3

OSPRay Studio 0.11 - Camera: 3 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer (ms, fewer is better): Default = 20223, SE +/- 50.67, N = 3

1. (CXX) g++ options: -O3 -lm -ldl

OSPRay

Intel OSPRay is a portable ray-tracing engine for high-performance, high-fidelity scientific visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OSPRay 2.12 - Benchmark: particle_volume/ao/real_time (Items Per Second, more is better): Default = 25.16, SE +/- 0.01, N = 3

Timed Godot Game Engine Compilation

This test times how long it takes to compile the Godot Game Engine. Godot is a popular, open-source, cross-platform 2D/3D game engine. It is built here using the SCons build system, targeting the X11 platform. Learn more via the OpenBenchmarking.org test page.

Timed Godot Game Engine Compilation 4.0 - Time To Compile (seconds, fewer is better): Default = 88.38, SE +/- 0.06, N = 3

PyHPC Benchmarks

PyHPC-Benchmarks is a suite of Python high performance computing benchmarks for execution on CPUs and GPUs using various popular Python HPC libraries. The PyHPC CPU-based benchmarks focus on sequential CPU performance. Learn more via the OpenBenchmarking.org test page.

PyHPC Benchmarks 3.0 - Device: CPU - Backend: Numpy - Project Size: 4194304 - Benchmark: Isoneutral Mixing (seconds, fewer is better): Default = 1.578, SE +/- 0.001, N = 3

Zstd Compression

This test measures the time needed to compress/decompress a sample file (silesia.tar) using Zstd (Zstandard) compression with options for different compression levels / settings. Learn more via the OpenBenchmarking.org test page.

Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Decompression Speed (MB/s, more is better): Default = 1357.8, SE +/- 2.33, N = 3

Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Compression Speed (MB/s, more is better): Default = 8.52, SE +/- 0.01, N = 3

1. (CC) gcc options: -O3 -pthread -lz -llzma
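To make the MB/s figures concrete, the sketch below measures compression and decompression throughput at several levels. It is not the Zstd test itself: it substitutes Python's standard-library `zlib` for Zstd, uses a synthetic buffer instead of silesia.tar, and assumes 1 MB = 10^6 bytes. The function names are hypothetical.

```python
import time
import zlib


def throughput_mb_s(num_bytes, seconds):
    """Convert a byte count and elapsed time into MB/s (1 MB = 10**6 bytes, an assumption)."""
    return (num_bytes / 1_000_000) / seconds if seconds > 0 else float("inf")


def benchmark_level(data, level):
    """Time one compress/decompress round trip of `data` at the given zlib level."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    compress_seconds = time.perf_counter() - start

    start = time.perf_counter()
    restored = zlib.decompress(compressed)
    decompress_seconds = time.perf_counter() - start

    if restored != data:
        raise ValueError("round trip mismatch")
    # Both speeds are reported relative to the uncompressed size.
    return {
        "level": level,
        "ratio": len(data) / len(compressed),
        "compress_mb_s": throughput_mb_s(len(data), compress_seconds),
        "decompress_mb_s": throughput_mb_s(len(data), decompress_seconds),
    }


if __name__ == "__main__":
    # Synthetic, highly repetitive sample; real corpora compress very differently.
    sample = b"The quick brown fox jumps over the lazy dog. " * 200_000
    for level in (1, 6, 9):
        print(benchmark_level(sample, level))
```

Higher levels generally trade compression speed for ratio while decompression speed changes far less, which matches the pattern in the Zstd results in this file.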

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note that the Phoronix Test Suite also provides pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if complementary metrics are desired. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.12 - Device: CPU - Batch Size: 64 - Model: ResNet-50 (images/sec, more is better): Default = 94.50, SE +/- 0.09, N = 3

TensorFlow 2.12 - Device: CPU - Batch Size: 256 - Model: GoogLeNet (images/sec, more is better): Default = 387.68, SE +/- 0.40, N = 3

Stockfish

This is a test of Stockfish, an advanced open-source C++11 chess benchmark that can scale up to 512 CPU threads. Learn more via the OpenBenchmarking.org test page.

Stockfish 15 - Total Time (Nodes Per Second, more is better): Default = 297289892, SE +/- 1047576.89, N = 3

1. (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -msse -msse3 -mpopcnt -mavx2 -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto=jobserver

Laghos

Laghos (LAGrangian High-Order Solver) is a miniapp that solves the time-dependent Euler equations of compressible gas dynamics in a moving Lagrangian frame using unstructured high-order finite element spatial discretization and explicit high-order time-stepping. Learn more via the OpenBenchmarking.org test page.

Laghos 3.1 - Test: Sedov Blast Wave, ube_922_hex.mesh (Major Kernels Total Rate, more is better): Default = 446.93, SE +/- 1.52, N = 3

1. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi

Zstd Compression

Zstd Compression 1.5.4 - Compression Level: 19 - Decompression Speed (MB/s, more is better): Default = 1433.4, SE +/- 6.29, N = 3

Zstd Compression 1.5.4 - Compression Level: 19 - Compression Speed (MB/s, more is better): Default = 17.3, SE +/- 0.03, N = 3

1. (CC) gcc options: -O3 -pthread -lz -llzma

OpenVINO

This is a test of Intel OpenVINO, a toolkit around neural networks, using its built-in benchmarking support to analyze the throughput and latency of various models. Learn more via the OpenBenchmarking.org test page.

OpenVINO 2022.3 - Model: Person Detection FP32 - Device: CPU (ms, fewer is better): Default = 1766.54, SE +/- 6.69, N = 3, MIN: 966.02 / MAX: 2507.19

OpenVINO 2022.3 - Model: Person Detection FP32 - Device: CPU (FPS, more is better): Default = 26.89, SE +/- 0.10, N = 3

OpenVINO 2022.3 - Model: Person Detection FP16 - Device: CPU (ms, fewer is better): Default = 1753.63, SE +/- 4.64, N = 3, MIN: 955.99 / MAX: 2502.09

OpenVINO 2022.3 - Model: Person Detection FP16 - Device: CPU (FPS, more is better): Default = 27.11, SE +/- 0.10, N = 3

OpenVINO 2022.3 - Model: Face Detection FP16 - Device: CPU (ms, fewer is better): Default = 1003.51, SE +/- 0.35, N = 3, MIN: 498.16 / MAX: 1059.95

OpenVINO 2022.3 - Model: Face Detection FP16 - Device: CPU (FPS, more is better): Default = 47.62, SE +/- 0.02, N = 3

OpenVINO 2022.3 - Model: Face Detection FP16-INT8 - Device: CPU (ms, fewer is better): Default = 529.68, SE +/- 0.05, N = 3, MIN: 257.02 / MAX: 560.93

OpenVINO 2022.3 - Model: Face Detection FP16-INT8 - Device: CPU (FPS, more is better): Default = 90.36, SE +/- 0.01, N = 3

1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

Zstd Compression

Zstd Compression 1.5.4 - Compression Level: 12 - Decompression Speed (MB/s, more is better): Default = 1653.3, SE +/- 2.05, N = 3

Zstd Compression 1.5.4 - Compression Level: 12 - Compression Speed (MB/s, more is better): Default = 329.3, SE +/- 1.73, N = 3

Zstd Compression 1.5.4 - Compression Level: 3, Long Mode - Decompression Speed (MB/s, more is better): Default = 1534.5, SE +/- 2.41, N = 3

Zstd Compression 1.5.4 - Compression Level: 3, Long Mode - Compression Speed (MB/s, more is better): Default = 889.9, SE +/- 11.65, N = 3

Zstd Compression 1.5.4 - Compression Level: 3 - Decompression Speed (MB/s, more is better): Default = 1498.8, SE +/- 1.17, N = 3

Zstd Compression 1.5.4 - Compression Level: 3 - Compression Speed (MB/s, more is better): Default = 3989.7, SE +/- 22.49, N = 3

Zstd Compression 1.5.4 - Compression Level: 8, Long Mode - Decompression Speed (MB/s, more is better): Default = 1648.1, SE +/- 0.64, N = 3

Zstd Compression 1.5.4 - Compression Level: 8, Long Mode - Compression Speed (MB/s, more is better): Default = 876.9, SE +/- 3.55, N = 3

Zstd Compression 1.5.4 - Compression Level: 8 - Decompression Speed (MB/s, more is better): Default = 1639.9, SE +/- 2.25, N = 3

Zstd Compression 1.5.4 - Compression Level: 8 - Compression Speed (MB/s, more is better): Default = 1226.7, SE +/- 1.56, N = 3

1. (CC) gcc options: -O3 -pthread -lz -llzma

OpenVINO

OpenVINO 2022.3 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (ms, fewer is better): Default = 8.36, SE +/- 0.00, N = 3, MIN: 4.99 / MAX: 31.02

OpenVINO 2022.3 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (FPS, more is better): Default = 5732.75, SE +/- 2.96, N = 3

OpenVINO 2022.3 - Model: Machine Translation EN To DE FP16 - Device: CPU (ms, fewer is better): Default = 89.50, SE +/- 0.11, N = 3, MIN: 39.21 / MAX: 143.98

OpenVINO 2022.3 - Model: Machine Translation EN To DE FP16 - Device: CPU (FPS, more is better): Default = 535.77, SE +/- 0.72, N = 3

OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, fewer is better): Default = 0.84, SE +/- 0.00, N = 3, MIN: 0.3 / MAX: 12.71

OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (FPS, more is better): Default = 94505.11, SE +/- 57.42, N = 3

OpenVINO 2022.3 - Model: Vehicle Detection FP16-INT8 - Device: CPU (ms, fewer is better): Default = 8.45, SE +/- 0.00, N = 3, MIN: 4.5 / MAX: 30.54

OpenVINO 2022.3 - Model: Vehicle Detection FP16-INT8 - Device: CPU (FPS, more is better): Default = 5672.06, SE +/- 0.38, N = 3

OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (ms, fewer is better): Default = 1.33, SE +/- 0.00, N = 3, MIN: 0.47 / MAX: 14.67

OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (FPS, more is better): Default = 64461.93, SE +/- 63.50, N = 3

OpenVINO 2022.3 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (ms, fewer is better): Default = 10.84, SE +/- 0.00, N = 3, MIN: 5.62 / MAX: 24.41

OpenVINO 2022.3 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (FPS, more is better): Default = 8846.67, SE +/- 0.12, N = 3

OpenVINO 2022.3 - Model: Vehicle Detection FP16 - Device: CPU (ms, fewer is better): Default = 12.42, SE +/- 0.00, N = 3, MIN: 5.8 / MAX: 47.23

OpenVINO 2022.3 - Model: Vehicle Detection FP16 - Device: CPU (FPS, more is better): Default = 3860.64, SE +/- 0.30, N = 3

OpenVINO 2022.3 - Model: Weld Porosity Detection FP16 - Device: CPU (ms, fewer is better): Default = 10.28, SE +/- 0.00, N = 3, MIN: 5.38 / MAX: 28.56

OpenVINO 2022.3 - Model: Weld Porosity Detection FP16 - Device: CPU (FPS, more is better): Default = 4663.97, SE +/- 0.19, N = 3

1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OSPRay

OSPRay 2.12 - Benchmark: gravity_spheres_volume/dim_512/scivis/real_time (Items Per Second, more is better): Default = 25.93, SE +/- 0.04, N = 3

OSPRay 2.12 - Benchmark: gravity_spheres_volume/dim_512/ao/real_time (Items Per Second, more is better): Default = 26.79, SE +/- 0.12, N = 3

OSPRay 2.12 - Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time (Items Per Second, more is better): Default = 26.51, SE +/- 0.02, N = 3

Neural Magic DeepSparse

This is a benchmark of Neural Magic's DeepSparse using its built-in deepsparse.benchmark utility and various models from their SparseZoo (https://sparsezoo.neuralmagic.com/). Learn more via the OpenBenchmarking.org test page.

Neural Magic DeepSparse 1.5 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better): Default = 417.86, SE +/- 0.35, N = 3

Neural Magic DeepSparse 1.5 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (items/sec, more is better): Default = 114.18, SE +/- 0.07, N = 3
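The latency (ms/batch) and throughput (items/sec) figures for the same model are related through the number of requests in flight. As a rough reading aid, the sketch below applies Little's law (average concurrency = throughput x latency) to the YOLACT pair above, 417.86 ms/batch and 114.18 items/sec. It assumes one item per batch. The result file does not report a stream count, so the output is an inference rather than a measured value, and `implied_concurrency` is a hypothetical helper, not part of DeepSparse.

```python
def implied_concurrency(throughput_items_per_s, latency_ms):
    """Little's law: average items in flight = throughput * latency.

    Assumes one item per batch, so ms/batch is treated as per-item latency.
    """
    return throughput_items_per_s * (latency_ms / 1000.0)


if __name__ == "__main__":
    # Figures taken from the CV Segmentation (YOLACT) result pair.
    in_flight = implied_concurrency(114.18, 417.86)
    print(f"implied concurrent batches: {in_flight:.1f}")
```

For this pair the product is about 47.7, which would be consistent with a few dozen concurrent streams. The same check applies to the other Asynchronous Multi-Stream pairs in this file.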

Blender

Blender 3.6 - Blend File: Pabellon Barcelona - Compute: CPU-Only (seconds, fewer is better): Default = 49.61, SE +/- 0.08, N = 3

Neural Magic DeepSparse

Neural Magic DeepSparse 1.5 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better): Default = 783.58, SE +/- 1.88, N = 3

Neural Magic DeepSparse 1.5 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (items/sec, more is better): Default = 60.63, SE +/- 0.03, N = 3

Neural Magic DeepSparse 1.5 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better): Default = 785.21, SE +/- 0.16, N = 3

Neural Magic DeepSparse 1.5 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (items/sec, more is better): Default = 60.58, SE +/- 0.02, N = 3

TensorFlow

TensorFlow 2.12 - Device: CPU - Batch Size: 32 - Model: ResNet-50 (images/sec, more is better): Default = 78.11, SE +/- 0.18, N = 3

Laghos

Laghos 3.1 - Test: Triple Point Problem (Major Kernels Total Rate, more is better): Default = 238.89, SE +/- 1.65, N = 3

1. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi

Neural Magic DeepSparse

Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better): Default = 185.85, SE +/- 0.04, N = 3

Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (items/sec, more is better): Default = 257.57, SE +/- 0.06, N = 3

TensorFlow

TensorFlow 2.12 - Device: CPU - Batch Size: 512 - Model: AlexNet (images/sec, more is better): Default = 1287.81, SE +/- 0.67, N = 3

Neural Magic DeepSparse

Neural Magic DeepSparse 1.5 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better): Default = 145.05, SE +/- 0.07, N = 3

Neural Magic DeepSparse 1.5 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream (items/sec, more is better): Default = 329.98, SE +/- 0.25, N = 3

Neural Magic DeepSparse 1.5 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better): Default = 39.31, SE +/- 0.03, N = 3

Neural Magic DeepSparse 1.5 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream (items/sec, more is better): Default = 1219.50, SE +/- 0.89, N = 3

Blender

Blender 3.6 - Blend File: Classroom - Compute: CPU-Only (seconds, fewer is better): Default = 40.51, SE +/- 0.11, N = 3

Neural Magic DeepSparse

Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better): Default = 93.32, SE +/- 0.02, N = 3

Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (items/sec, more is better): Default = 513.19, SE +/- 0.30, N = 3

Neural Magic DeepSparse 1.5 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better): Default = 60.06, SE +/- 0.03, N = 3

Neural Magic DeepSparse 1.5 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (items/sec, more is better): Default = 797.95, SE +/- 0.46, N = 3

Neural Magic DeepSparse 1.5 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better): Default = 139.02, SE +/- 0.13, N = 3

Neural Magic DeepSparse 1.5 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream (items/sec, more is better): Default = 344.41, SE +/- 0.41, N = 3

Timed Linux Kernel Compilation

This test times how long it takes to build the Linux kernel in a default configuration (defconfig) for the architecture being tested or alternatively an allmodconfig for building all possible kernel modules for the build. Learn more via the OpenBenchmarking.org test page.

Timed Linux Kernel Compilation 6.1 - Build: defconfig (seconds, fewer is better): Default = 22.92, SE +/- 0.25, N = 5

GPAW

GPAW is a density-functional theory (DFT) Python code based on the projector-augmented wave (PAW) method and the atomic simulation environment (ASE). Learn more via the OpenBenchmarking.org test page.

GPAW 23.6 - Input: Carbon Nanotube (seconds, fewer is better): Default = 34.77, SE +/- 0.18, N = 3

1. (CC) gcc options: -shared -fwrapv -O2 -lxc -lblas -lmpi

TensorFlow

TensorFlow 2.12 - Device: CPU - Batch Size: 16 - Model: ResNet-50 (images/sec, more is better): Default = 57.21, SE +/- 0.21, N = 3

7-Zip Compression

This is a test of 7-Zip compression/decompression with its integrated benchmark feature. Learn more via the OpenBenchmarking.org test page.

7-Zip Compression 22.01 - Test: Decompression Rating (MIPS, more is better): Default = 626472, SE +/- 1050.05, N = 3

7-Zip Compression 22.01 - Test: Compression Rating (MIPS, more is better): Default = 647063, SE +/- 469.85, N = 3

1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC

Timed PHP Compilation

This test times how long it takes to build PHP. Learn more via the OpenBenchmarking.org test page.

Timed PHP Compilation 8.1.9 - Time To Compile (seconds, fewer is better): Default = 33.45, SE +/- 0.27, N = 3

Stress-NG

Stress-NG is a Linux stress tool developed by Colin Ian King. Learn more via the OpenBenchmarking.org test page.

Stress-NG 0.15.10 - Test: CPU Cache (Bogo Ops/s, more is better): Default = 1397574.75, SE +/- 11233.16, N = 3

1. (CXX) g++ options: -O2 -std=gnu99 -lc

srsRAN Project

srsRAN Project is a complete ORAN-native 5G RAN solution created by Software Radio Systems (SRS). The srsRAN Project radio suite was formerly known as srsLTE and can be used for building your own software-defined radio (SDR) 4G/5G mobile network. Learn more via the OpenBenchmarking.org test page.

srsRAN Project 23.5 - Test: PUSCH Processor Benchmark, Throughput Total (Mbps, more is better): Default = 18408.1, SE +/- 23.08, N = 3

1. (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest

Stress-NG

Stress-NG 0.15.10 - Test: Atomic (Bogo Ops/s, more is better): Default = 237.83, SE +/- 0.05, N = 3

Stress-NG 0.15.10 - Test: MMAP (Bogo Ops/s, more is better): Default = 1444.92, SE +/- 0.78, N = 3

Stress-NG 0.15.10 - Test: Pthread (Bogo Ops/s, more is better): Default = 103280.79, SE +/- 525.98, N = 3

Stress-NG 0.15.10 - Test: MEMFD (Bogo Ops/s, more is better): Default = 454.75, SE +/- 0.83, N = 3

1. (CXX) g++ options: -O2 -std=gnu99 -lc

Liquid-DSP

LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.

Liquid-DSP 1.6 - Threads: 192 - Buffer Length: 256 - Filter Length: 512 (samples/s, more is better): Default = 1287866667, SE +/- 762306.44, N = 3

1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Stress-NG

Stress-NG 0.15.10 - Test: Malloc (Bogo Ops/s, more is better): Default = 360348999.65, SE +/- 494466.73, N = 3

1. (CXX) g++ options: -O2 -std=gnu99 -lc

Liquid-DSP

Liquid-DSP 1.6, Threads: 128 - Buffer Length: 256 - Filter Length: 512. samples/s, More Is Better. Default: 1081100000 (SE +/- 953939.20, N = 3). (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Stress-NG

Stress-NG 0.15.10, Test: Zlib. Bogo Ops/s, More Is Better. Default: 10471.17 (SE +/- 2.81, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10, Test: Matrix 3D Math. Bogo Ops/s, More Is Better. Default: 16595.28 (SE +/- 154.68, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Liquid-DSP

Liquid-DSP 1.6, Threads: 64 - Buffer Length: 256 - Filter Length: 512. samples/s, More Is Better. Default: 727523333 (SE +/- 2197592.12, N = 3). (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Stress-NG

Stress-NG 0.15.10, Test: Glibc Qsort Data Sorting. Bogo Ops/s, More Is Better. Default: 2108.30 (SE +/- 0.52, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10, Test: Vector Math. Bogo Ops/s, More Is Better. Default: 545725.73 (SE +/- 61.16, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10, Test: Forking. Bogo Ops/s, More Is Better. Default: 40400.83 (SE +/- 70.57, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10, Test: NUMA. Bogo Ops/s, More Is Better. Default: 2478.54 (SE +/- 6.42, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10, Test: Futex. Bogo Ops/s, More Is Better. Default: 3985709.56 (SE +/- 13786.79, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10, Test: Glibc C String Functions. Bogo Ops/s, More Is Better. Default: 81208060.17 (SE +/- 512163.22, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10, Test: Poll. Bogo Ops/s, More Is Better. Default: 13269096.62 (SE +/- 7203.48, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10, Test: Wide Vector Math. Bogo Ops/s, More Is Better. Default: 3485374.33 (SE +/- 964.27, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10, Test: AVL Tree. Bogo Ops/s, More Is Better. Default: 1665.57 (SE +/- 0.40, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10, Test: Memory Copying. Bogo Ops/s, More Is Better. Default: 32994.18 (SE +/- 1.22, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10, Test: Floating Point. Bogo Ops/s, More Is Better. Default: 29883.93 (SE +/- 16.24, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10, Test: Function Call. Bogo Ops/s, More Is Better. Default: 67313.39 (SE +/- 29.99, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Liquid-DSP

Liquid-DSP 1.6, Threads: 192 - Buffer Length: 256 - Filter Length: 57. samples/s, More Is Better. Default: 4634633333 (SE +/- 3001296.02, N = 3). (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Stress-NG

Stress-NG 0.15.10, Test: Fused Multiply-Add. Bogo Ops/s, More Is Better. Default: 76566577.12 (SE +/- 17500.16, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10, Test: Crypto. Bogo Ops/s, More Is Better. Default: 202368.61 (SE +/- 436.93, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10, Test: Pipe. Bogo Ops/s, More Is Better. Default: 60302912.67 (SE +/- 612854.45, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10, Test: Hash. Bogo Ops/s, More Is Better. Default: 18746640.55 (SE +/- 3937.95, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10, Test: Vector Floating Point. Bogo Ops/s, More Is Better. Default: 257925.61 (SE +/- 151.27, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10, Test: Vector Shuffle. Bogo Ops/s, More Is Better. Default: 63761.29 (SE +/- 3.12, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10, Test: Mutex. Bogo Ops/s, More Is Better. Default: 49736118.06 (SE +/- 152725.58, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Liquid-DSP

Liquid-DSP 1.6, Threads: 192 - Buffer Length: 256 - Filter Length: 32. samples/s, More Is Better. Default: 4958533333 (SE +/- 202758.75, N = 3). (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP 1.6, Threads: 128 - Buffer Length: 256 - Filter Length: 57. samples/s, More Is Better. Default: 3876733333 (SE +/- 4532230.26, N = 3). (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Stress-NG

Stress-NG 0.15.10, Test: Semaphores. Bogo Ops/s, More Is Better. Default: 223213939.86 (SE +/- 2019374.88, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10, Test: CPU Stress. Bogo Ops/s, More Is Better. Default: 212380.76 (SE +/- 233.96, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10, Test: System V Message Passing. Bogo Ops/s, More Is Better. Default: 12085427.22 (SE +/- 5434.16, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10, Test: Matrix Math. Bogo Ops/s, More Is Better. Default: 418033.13 (SE +/- 67.17, N = 3). (CXX) g++ options: -O2 -std=gnu99 -lc

Liquid-DSP

Liquid-DSP 1.6, Threads: 128 - Buffer Length: 256 - Filter Length: 32. samples/s, More Is Better. Default: 3768733333 (SE +/- 1471205.10, N = 3). (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP 1.6, Threads: 64 - Buffer Length: 256 - Filter Length: 57. samples/s, More Is Better. Default: 2609466667 (SE +/- 11597461.41, N = 3). (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP 1.6, Threads: 64 - Buffer Length: 256 - Filter Length: 32. samples/s, More Is Better. Default: 2181266667 (SE +/- 995545.63, N = 3). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
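
The Liquid-DSP results in this file cover 64, 128 and 192 threads at each filter length, so thread scaling can be estimated directly from the reported throughput. The following is a minimal Python sketch of that arithmetic, not part of the Phoronix Test Suite or Liquid-DSP. The throughput figures are copied from the results in this file; the function name `scaling_efficiency` and the choice of 64 threads as the baseline are assumptions made for illustration.

```python
# Liquid-DSP 1.6 throughput in samples/s (buffer length 256), copied from
# the results in this file, keyed by filter length and then thread count.
RESULTS = {
    32: {64: 2181266667, 128: 3768733333, 192: 4958533333},
    57: {64: 2609466667, 128: 3876733333, 192: 4634633333},
    512: {64: 727523333, 128: 1081100000, 192: 1287866667},
}


def scaling_efficiency(by_threads, base_threads=64):
    """Return {threads: (speedup, efficiency)} relative to base_threads.

    speedup    = throughput(threads) / throughput(base_threads)
    efficiency = speedup / (threads / base_threads); 1.0 is linear scaling.
    The 64-thread baseline is an illustrative choice, not a PTS convention.
    """
    base = by_threads[base_threads]
    out = {}
    for threads, rate in sorted(by_threads.items()):
        speedup = rate / base
        out[threads] = (speedup, speedup / (threads / base_threads))
    return out


if __name__ == "__main__":
    for filter_len, by_threads in RESULTS.items():
        for threads, (speedup, eff) in scaling_efficiency(by_threads).items():
            print(f"filter={filter_len:>3} threads={threads:>3} "
                  f"speedup={speedup:.2f}x efficiency={eff:.0%}")
```

The tested CPU has 96 cores and 192 threads, so the 128- and 192-thread runs necessarily place more than one thread on some cores; efficiency below 100% relative to 64 threads is therefore not surprising.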

ASKAP

ASKAP is a set of benchmarks from the Australian SKA Pathfinder. The principal ASKAP benchmarks are the Hogbom Clean Benchmark (tHogbomClean) and the Convolutional Resampling Benchmark (tConvolve); some earlier ASKAP benchmarks are also included for OpenCL and CUDA execution of tConvolve. Learn more via the OpenBenchmarking.org test page.

ASKAP 1.0, Test: tConvolve MPI - Gridding. Mpix/sec, More Is Better. Default: 73226.7 (SE +/- 0.00, N = 3). (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP 1.0, Test: tConvolve MPI - Degridding. Mpix/sec, More Is Better. Default: 59791.1 (SE +/- 380.83, N = 3). (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently targets CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512. GFLOP/s, More Is Better. Default: 67.85 (SE +/- 0.10, N = 3). (CXX) g++ options: -O3

NAMD

NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. Learn more via the OpenBenchmarking.org test page.

NAMD 2.14, ATPase Simulation - 327,506 Atoms. days/ns, Fewer Is Better. Default: 0.24733 (SE +/- 0.00040, N = 3).
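
NAMD reports its result in days/ns (fewer is better), whereas the GROMACS result in this file is given in ns per day (more is better). The two units are reciprocals of each other. The short Python sketch below shows the conversion for the NAMD figure reported above; the function name is hypothetical, and the converted value is only a unit change. It is not comparable with the GROMACS number, because the two tests simulate different systems.

```python
def days_per_ns_to_ns_per_day(days_per_ns: float) -> float:
    """Convert a days/ns figure (fewer is better) to ns/day (more is better)."""
    if days_per_ns <= 0:
        raise ValueError("days/ns must be positive")
    return 1.0 / days_per_ns


# NAMD 2.14, ATPase Simulation - 327,506 Atoms, as reported in this file.
NAMD_DAYS_PER_NS = 0.24733

if __name__ == "__main__":
    ns_per_day = days_per_ns_to_ns_per_day(NAMD_DAYS_PER_NS)
    print(f"{NAMD_DAYS_PER_NS} days/ns = {ns_per_day:.2f} ns/day")
```

For the value above this works out to roughly 4.04 ns/day.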

PyHPC Benchmarks

PyHPC-Benchmarks is a suite of Python high performance computing benchmarks for execution on CPUs and GPUs using various popular Python HPC libraries. The PyHPC CPU-based benchmarks focus on sequential CPU performance. Learn more via the OpenBenchmarking.org test page.

PyHPC Benchmarks 3.0, Device: CPU - Backend: Numpy - Project Size: 4194304 - Benchmark: Equation of State. Seconds, Fewer Is Better. Default: 0.767 (SE +/- 0.004, N = 3).

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.12, Device: CPU - Batch Size: 64 - Model: GoogLeNet. images/sec, More Is Better. Default: 296.19 (SE +/- 0.03, N = 3).

Algebraic Multi-Grid Benchmark

AMG is a parallel algebraic multigrid solver for linear systems arising from problems on unstructured grids. The driver provided with AMG builds linear systems for various 3-dimensional problems. Learn more via the OpenBenchmarking.org test page.

Algebraic Multi-Grid Benchmark 1.2. Figure Of Merit, More Is Better. Default: 2414283000 (SE +/- 4643811.51, N = 3). (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -lmpi

GROMACS

This is a test of the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package using the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

GROMACS 2023, Implementation: MPI CPU - Input: water_GMX50_bare. Ns Per Day, More Is Better. Default: 11.79 (SE +/- 0.01, N = 3). (CXX) g++ options: -O3

TensorFlow

TensorFlow 2.12, Device: CPU - Batch Size: 256 - Model: AlexNet. images/sec, More Is Better. Default: 1196.16 (SE +/- 0.47, N = 3).

SPECFEM3D

SPECFEM3D simulates acoustic (fluid), elastic (solid), coupled acoustic/elastic, poroelastic or seismic wave propagation in any type of conforming mesh of hexahedra. This test profile currently relies on CPU-based execution of SPECFEM3D and uses a variety of its built-in examples/models for benchmarking. Learn more via the OpenBenchmarking.org test page.

SPECFEM3D 4.0, Model: Water-layered Halfspace. Seconds, Fewer Is Better. Default: 21.12 (SE +/- 0.19, N = 3). (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

SPECFEM3D 4.0, Model: Layered Halfspace. Seconds, Fewer Is Better. Default: 21.35 (SE +/- 0.10, N = 3). (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Blender

Blender 3.6, Blend File: Fishy Cat - Compute: CPU-Only. Seconds, Fewer Is Better. Default: 20.62 (SE +/- 0.09, N = 3).

Remhos

Remhos (REMap High-Order Solver) is a miniapp that solves the pure advection equations that are used to perform monotonic and conservative discontinuous field interpolation (remap) as part of the Eulerian phase in Arbitrary Lagrangian Eulerian (ALE) simulations. Learn more via the OpenBenchmarking.org test page.

Remhos 1.0, Test: Sample Remap Example. Seconds, Fewer Is Better. Default: 10.29 (SE +/- 0.10, N = 6). (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi

LULESH

LULESH is the Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics. Learn more via the OpenBenchmarking.org test page.

LULESH 2.0.3. z/s, More Is Better. Default: 30715.33 (SE +/- 182.94, N = 4). (CXX) g++ options: -O3 -fopenmp -lm -lmpi_cxx -lmpi

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512. GFLOP/s, More Is Better. Default: 134.82 (SE +/- 0.58, N = 4). (CXX) g++ options: -O3

Xmrig

Xmrig is an open-source cross-platform CPU/GPU miner for RandomX, KawPow, CryptoNight and AstroBWT. This test profile is set up to measure the Xmrig CPU mining performance. Learn more via the OpenBenchmarking.org test page.

Xmrig 6.18.1, Variant: Monero - Hash Count: 1M. H/s, More Is Better. Default: 69684.3 (SE +/- 83.67, N = 4). (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

srsRAN Project

srsRAN Project is a complete ORAN-native 5G RAN solution created by Software Radio Systems (SRS). The srsRAN Project radio suite was formerly known as srsLTE and can be used for building your own software-defined radio (SDR) 4G/5G mobile network. Learn more via the OpenBenchmarking.org test page.

srsRAN Project 23.5, Test: Downlink Processor Benchmark. Mbps, More Is Better. Default: 721.3 (SE +/- 6.73, N = 3). (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest

OpenFOAM

OpenFOAM is the leading free, open-source software for computational fluid dynamics (CFD). This test profile currently uses the drivaerFastback test case for analyzing automotive aerodynamics or alternatively the older motorBike input. Learn more via the OpenBenchmarking.org test page.

OpenFOAM 10, Input: drivaerFastback, Small Mesh Size - Execution Time. Seconds, Fewer Is Better. Default: 29.88. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

OpenFOAM 10, Input: drivaerFastback, Small Mesh Size - Mesh Time. Seconds, Fewer Is Better. Default: 22.97. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

Xmrig

Xmrig 6.18.1, Variant: Wownero - Hash Count: 1M. H/s, More Is Better. Default: 74009.7 (SE +/- 23.08, N = 4). (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

Embree

Intel Embree is a collection of high-performance ray-tracing kernels for execution on CPUs (and GPUs via SYCL) and supporting instruction sets such as SSE, AVX, AVX2, and AVX-512. Embree also supports making use of the Intel SPMD Program Compiler (ISPC). Learn more via the OpenBenchmarking.org test page.

Embree 4.1, Binary: Pathtracer ISPC - Model: Asian Dragon Obj. Frames Per Second, More Is Better. Default: 123.51 (SE +/- 0.05, N = 4; MIN: 121.61 / MAX: 126.87).

TensorFlow

TensorFlow 2.12, Device: CPU - Batch Size: 16 - Model: GoogLeNet. images/sec, More Is Better. Default: 156.94 (SE +/- 0.40, N = 4).

SPECFEM3D

SPECFEM3D 4.0, Model: Tomographic Model. Seconds, Fewer Is Better. Default: 8.738903585 (SE +/- 0.026518372, N = 5). (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

SPECFEM3D 4.0, Model: Mount St. Helens. Seconds, Fewer Is Better. Default: 8.898145240 (SE +/- 0.087169113, N = 5). (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

SPECFEM3D 4.0, Model: Homogeneous Halfspace. Seconds, Fewer Is Better. Default: 11.32 (SE +/- 0.14, N = 4). (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

NAS Parallel Benchmarks

NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB. This test profile offers selecting the different NPB tests/problems and varying problem sizes. Learn more via the OpenBenchmarking.org test page.

NAS Parallel Benchmarks 3.4, Test / Class: EP.D. Total Mop/s, More Is Better. Default: 10697.90 (SE +/- 40.63, N = 4). (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz. Open MPI 4.1.2

Monte Carlo Simulations of Ionised Nebulae

Mocassin is the Monte Carlo Simulations of Ionised Nebulae. MOCASSIN is a fully 3D or 2D photoionisation and dust radiative transfer code which employs a Monte Carlo approach to the transfer of radiation through media of arbitrary geometry and density distribution. Learn more via the OpenBenchmarking.org test page.

Monte Carlo Simulations of Ionised Nebulae 2.02.73.3, Input: Gas HII40. Seconds, Fewer Is Better. Default: 13.28 (SE +/- 0.03, N = 4). (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O2 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lz

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512. GFLOP/s, More Is Better. Default: 153.84 (SE +/- 0.85, N = 4). (CXX) g++ options: -O3

TensorFlow

TensorFlow 2.12, Device: CPU - Batch Size: 64 - Model: AlexNet. images/sec, More Is Better. Default: 786.60 (SE +/- 2.22, N = 5).

ASKAP

ASKAP 1.0, Test: Hogbom Clean OpenMP. Iterations Per Second, More Is Better. Default: 1212.17 (SE +/- 4.24, N = 4). (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

TensorFlow

TensorFlow 2.12, Device: CPU - Batch Size: 32 - Model: GoogLeNet. images/sec, More Is Better. Default: 238.18 (SE +/- 0.31, N = 3).

CloverLeaf

CloverLeaf is a Lagrangian-Eulerian hydrodynamics benchmark. This test profile currently uses CloverLeaf's OpenMP version and is benchmarked with the clover_bm.in input file (Problem 5). Learn more via the OpenBenchmarking.org test page.

CloverLeaf, Lagrangian-Eulerian Hydrodynamics. Seconds, Fewer Is Better. Default: 10.29 (SE +/- 0.04, N = 5). (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp

Blender

Blender 3.6, Blend File: BMW27 - Compute: CPU-Only. Seconds, Fewer Is Better. Default: 16.28 (SE +/- 0.10, N = 3).

NAS Parallel Benchmarks

NAS Parallel Benchmarks 3.4, Test / Class: BT.C. Total Mop/s, More Is Better. Default: 314777.90 (SE +/- 1672.02, N = 5). (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz. Open MPI 4.1.2

ASTC Encoder

ASTC Encoder (astcenc) is an encoder for the Adaptive Scalable Texture Compression (ASTC) format commonly used with the OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile runs a coding test of both compression and decompression. Learn more via the OpenBenchmarking.org test page.

ASTC Encoder 4.0, Preset: Thorough. MT/s, More Is Better. Default: 56.78 (SE +/- 0.02, N = 6). (CXX) g++ options: -O3 -flto -pthread

ASTC Encoder 4.0, Preset: Exhaustive. MT/s, More Is Better. Default: 6.1411 (SE +/- 0.0017, N = 4). (CXX) g++ options: -O3 -flto -pthread

NAS Parallel Benchmarks

NAS Parallel Benchmarks 3.4, Test / Class: SP.C. Total Mop/s, More Is Better. Default: 207614.70 (SE +/- 1595.09, N = 6). (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz. Open MPI 4.1.2

miniFE

MiniFE Finite Element is an application for unstructured implicit finite element codes. Learn more via the OpenBenchmarking.org test page.

miniFE 2.2, Problem Size: Small. CG Mflops, More Is Better. Default: 54408.2 (SE +/- 227.90, N = 5). (CXX) g++ options: -O3 -fopenmp -lmpi_cxx -lmpi

TensorFlow

TensorFlow 2.12, Device: CPU - Batch Size: 16 - Model: AlexNet. images/sec, More Is Better. Default: 303.80 (SE +/- 0.49, N = 6).

Xcompact3d Incompact3d

Xcompact3d Incompact3d is a Fortran-MPI based, finite difference high-performance code for solving the incompressible Navier-Stokes equation along with as many scalar transport equations as needed. Learn more via the OpenBenchmarking.org test page.

Xcompact3d Incompact3d 2021-03-11, Input: input.i3d 129 Cells Per Direction. Seconds, Fewer Is Better. Default: 2.17521613 (SE +/- 0.04209465, N = 15). (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Google Draco

Draco is a library developed by Google for compressing/decompressing 3D geometric meshes and point clouds. This test profile uses some Artec3D PLY models as the sample 3D model input formats for Draco compression/decompression. Learn more via the OpenBenchmarking.org test page.

Google Draco 1.5.6, Model: Lion. ms, Fewer Is Better. Default: 5011 (SE +/- 2.98, N = 7). (CXX) g++ options: -O3

Google Draco 1.5.6, Model: Church Facade. ms, Fewer Is Better. Default: 6043 (SE +/- 3.90, N = 6). (CXX) g++ options: -O3

ASKAP

ASKAP 1.0, Test: tConvolve OpenMP - Degridding. Million Grid Points Per Second, More Is Better. Default: 55153.0 (SE +/- 1901.83, N = 7). (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP 1.0, Test: tConvolve OpenMP - Gridding. Million Grid Points Per Second, More Is Better. Default: 26625.6 (SE +/- 0.00, N = 7). (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
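
Each result in this file is reported as a mean with "SE +/-" and a run count N. To judge how noisy a result is, it can help to express the SE as a fraction of the mean. The Python sketch below does this for the ASKAP tConvolve OpenMP Degridding result reported above. It assumes that SE denotes the standard error of the mean (sample standard deviation divided by the square root of N), which is the usual definition but is not stated on this page; the function names are hypothetical.

```python
import math


def relative_se(mean: float, se: float) -> float:
    """Standard error expressed as a fraction of the mean."""
    return se / mean


def implied_stddev(se: float, n: int) -> float:
    """Run-to-run standard deviation implied by SE and N.

    Assumes SE = stddev / sqrt(N); that definition is an assumption here.
    """
    return se * math.sqrt(n)


# ASKAP 1.0, tConvolve OpenMP - Degridding, as reported in this file.
MEAN, SE, N = 55153.0, 1901.83, 7

if __name__ == "__main__":
    print(f"relative SE: {relative_se(MEAN, SE):.1%}")
    print(f"implied std dev across runs: {implied_stddev(SE, N):.1f}")
```

For this result the SE is about 3.4% of the mean, noticeably larger than for most other results in the file, which is consistent with the test having been run 7 times rather than the more common 3.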

NAS Parallel Benchmarks

NAS Parallel Benchmarks 3.4, Test / Class: IS.D. Total Mop/s, More Is Better. Default: 5696.51 (SE +/- 29.57, N = 5). (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz. Open MPI 4.1.2

Xcompact3d Incompact3d

Xcompact3d Incompact3d 2021-03-11, Input: input.i3d 193 Cells Per Direction. Seconds, Fewer Is Better. Default: 7.68528681 (SE +/- 0.06675033, N = 5). (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

TensorFlow

TensorFlow 2.12, Device: CPU - Batch Size: 32 - Model: AlexNet. images/sec, More Is Better. Default: 518.15 (SE +/- 0.63, N = 5).

Embree

Embree 4.1, Binary: Pathtracer ISPC - Model: Crown. Frames Per Second, More Is Better. Default: 117.83 (SE +/- 0.08, N = 7; MIN: 114.92 / MAX: 122.62).

ACES DGEMM

This is a multi-threaded DGEMM benchmark. Learn more via the OpenBenchmarking.org test page.

ACES DGEMM 1.0, Sustained Floating-Point Rate. GFLOP/s, More Is Better. Default: 40.55 (SE +/- 0.13, N = 7). (CC) gcc options: -O3 -march=native -fopenmp

NAS Parallel Benchmarks

NAS Parallel Benchmarks 3.4, Test / Class: LU.C. Total Mop/s, More Is Better. Default: 337910.64 (SE +/- 1381.97, N = 6). (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz. Open MPI 4.1.2

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512
GFLOP/s, More Is Better
Default: 332.74 (SE +/- 1.86, N = 6)
1. (CXX) g++ options: -O3

libxsmm

Libxsmm is an open-source library for specialized dense and sparse matrix operations and deep learning primitives. Libxsmm supports making use of Intel AMX, AVX-512, and other modern CPU instruction set capabilities. Learn more via the OpenBenchmarking.org test page.

libxsmm 2-1.17-3645 - M N K: 32
GFLOPS/s, More Is Better
Default: 1311.3 (SE +/- 1.58, N = 8)
1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

ASTC Encoder

ASTC Encoder (astcenc) is for the Adaptive Scalable Texture Compression (ASTC) format commonly used with OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile runs a coding test of both compression and decompression. Learn more via the OpenBenchmarking.org test page.

ASTC Encoder 4.0 - Preset: Medium
MT/s, More Is Better
Default: 419.43 (SE +/- 0.05, N = 8)
1. (CXX) g++ options: -O3 -flto -pthread

libxsmm

libxsmm 2-1.17-3645 - M N K: 64
GFLOPS/s, More Is Better
Default: 2455.4 (SE +/- 4.53, N = 7)
1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

Embree

Embree 4.1 - Binary: Pathtracer ISPC - Model: Asian Dragon
Frames Per Second, More Is Better
Default: 144.00 (SE +/- 0.10, N = 7; MIN: 141.47 / MAX: 149.12)

NAS Parallel Benchmarks

NAS Parallel Benchmarks 3.4 - Test / Class: FT.C
Total Mop/s, More Is Better
Default: 118831.89 (SE +/- 329.27, N = 8)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.2

NAS Parallel Benchmarks 3.4 - Test / Class: CG.C
Total Mop/s, More Is Better
Default: 59737.94 (SE +/- 470.01, N = 10)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.2

NAS Parallel Benchmarks 3.4 - Test / Class: MG.C
Total Mop/s, More Is Better
Default: 137308.27 (SE +/- 1630.26, N = 15)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.2
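
The SE figures for the NPB results are easier to compare across tests once they are scaled by the mean. The following is a minimal Python sketch, not part of the Phoronix Test Suite output, that computes the relative standard error (SE divided by the mean) for the FT.C, CG.C, and MG.C results above. The dictionary and function names are illustrative assumptions; the numbers are copied from the result lines, with N taken to be the number of runs.

```python
# Relative standard error (SE / mean) for the NPB class C results above.
# Names are illustrative; values are copied from the result lines.
npb_results = {
    "FT.C": {"mean": 118831.89, "se": 329.27, "n": 8},
    "CG.C": {"mean": 59737.94, "se": 470.01, "n": 10},
    "MG.C": {"mean": 137308.27, "se": 1630.26, "n": 15},
}


def relative_se_percent(mean: float, se: float) -> float:
    """Return the standard error as a percentage of the mean."""
    return 100.0 * se / mean


# Map each test name to its relative SE in percent.
rse = {
    name: relative_se_percent(r["mean"], r["se"])
    for name, r in npb_results.items()
}

if __name__ == "__main__":
    # Print from the most stable result to the noisiest one.
    for name, value in sorted(rse.items(), key=lambda kv: kv[1]):
        print(f"{name}: {value:.2f}% relative SE")
```

A larger relative SE alongside a larger N is consistent with a test that needed more runs to settle, although this sketch does not model how the run count was chosen.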

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256
GFLOP/s, More Is Better
Default: 87.04 (SE +/- 0.52, N = 9)
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256
GFLOP/s, More Is Better
Default: 190.13 (SE +/- 1.48, N = 15)
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256
GFLOP/s, More Is Better
Default: 178.44 (SE +/- 1.06, N = 11)
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256
GFLOP/s, More Is Better
Default: 315.81 (SE +/- 1.00, N = 12)
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128
GFLOP/s, More Is Better
Default: 79.92 (SE +/- 0.30, N = 13)
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128
GFLOP/s, More Is Better
Default: 129.88 (SE +/- 0.54, N = 13)
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128
GFLOP/s, More Is Better
Default: 126.03 (SE +/- 0.76, N = 13)
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128
GFLOP/s, More Is Better
Default: 185.39 (SE +/- 0.43, N = 13)
1. (CXX) g++ options: -O3
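
The eight HeFFTe results at grid sizes 256 and 128 pair up naturally as r2c versus c2c for each precision. As a rough aid to reading them, the Python sketch below divides each reported r2c rate by the matching c2c rate. The variable names are illustrative assumptions, the values are copied from the results above, and the ratio compares reported GFLOP/s only; it makes no claim about how HeFFTe counts operations for the two transform types.

```python
# Ratio of reported r2c to c2c throughput for the HeFFTe FFTW results above.
# Keys are (precision, grid size); values are GFLOP/s copied from the results.
c2c = {
    ("double", 256): 87.04,
    ("float", 256): 178.44,
    ("double", 128): 79.92,
    ("float", 128): 126.03,
}
r2c = {
    ("double", 256): 190.13,
    ("float", 256): 315.81,
    ("double", 128): 129.88,
    ("float", 128): 185.39,
}

# r2c rate divided by c2c rate for each matching configuration.
ratios = {key: r2c[key] / c2c[key] for key in c2c}

if __name__ == "__main__":
    for (precision, size), ratio in sorted(ratios.items()):
        print(f"{precision:6s} {size}: r2c / c2c = {ratio:.2f}")
```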

Kripke

Kripke is a simple, scalable, 3D Sn deterministic particle transport code. Its primary purpose is to research how data layout, programming paradigms and architectures affect the implementation and performance of Sn transport. Kripke is developed by LLNL. Learn more via the OpenBenchmarking.org test page.

Default: The test quit with a non-zero exit status.

PyHPC Benchmarks

PyHPC-Benchmarks is a suite of Python high performance computing benchmarks for execution on CPUs and GPUs using various popular Python HPC libraries. The PyHPC CPU-based benchmarks focus on sequential CPU performance. Learn more via the OpenBenchmarking.org test page.

Device: CPU - Backend: TensorFlow - Project Size: 4194304 - Benchmark: Equation of State

Default: The test run did not produce a result.

Device: CPU - Backend: PyTorch - Project Size: 4194304 - Benchmark: Isoneutral Mixing

Default: The test run did not produce a result.

Device: CPU - Backend: Aesara - Project Size: 4194304 - Benchmark: Isoneutral Mixing

Default: The test run did not produce a result.

Device: CPU - Backend: Aesara - Project Size: 4194304 - Benchmark: Equation of State

Default: The test run did not produce a result.

Device: CPU - Backend: Numba - Project Size: 4194304 - Benchmark: Isoneutral Mixing

Default: The test run did not produce a result.

Device: CPU - Backend: JAX - Project Size: 4194304 - Benchmark: Isoneutral Mixing

Default: The test run did not produce a result.

Device: CPU - Backend: JAX - Project Size: 4194304 - Benchmark: Equation of State

Default: The test run did not produce a result.

Device: CPU - Backend: PyTorch - Project Size: 4194304 - Benchmark: Equation of State

Default: The test run did not produce a result.

Device: CPU - Backend: Numba - Project Size: 4194304 - Benchmark: Equation of State

Default: The test run did not produce a result.

Device: CPU - Backend: TensorFlow - Project Size: 4194304 - Benchmark: Isoneutral Mixing

Default: The test run did not produce a result.

Stress-NG

Stress-NG is a Linux stress tool developed by Colin Ian King. Learn more via the OpenBenchmarking.org test page.

Test: x86_64 RdRand

Default: The test run did not produce a result. E: stress-ng: error: [1283964] No stress workers invoked (one or more were unsupported)
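
The stress-ng message only says that a worker was unsupported; it does not say why. One possible first check, sketched below in Python under the assumption of a Linux system exposing /proc/cpuinfo, is whether the kernel advertises the `rdrand` CPU flag. The helper name `cpu_has_flag` is hypothetical. If the flag is present, the cause more likely lies elsewhere, for example in how the stress-ng binary was built, which this sketch cannot determine.

```python
def cpu_has_flag(cpuinfo_text: str, flag: str) -> bool:
    """Return True if `flag` appears on any 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Only the 'flags' entries list CPU feature names, separated by spaces.
        if key.strip() == "flags" and flag in value.split():
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo", encoding="utf-8") as handle:
            print("rdrand advertised:", cpu_has_flag(handle.read(), "rdrand"))
    except OSError as exc:
        # Non-Linux systems or restricted environments may not expose this file.
        print("could not read /proc/cpuinfo:", exc)
```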

238 Results Shown

WRF
OpenFOAM:
  drivaerFastback, Large Mesh Size - Execution Time
  drivaerFastback, Large Mesh Size - Mesh Time
PETSc
High Performance Conjugate Gradient:
  144 144 144 - 60
  192 192 192 - 60
Whisper.cpp
libxsmm
High Performance Conjugate Gradient:
  160 160 160 - 60
  104 104 104 - 60
Whisper.cpp
TensorFlow
Palabos
LeelaChessZero
TensorFlow:
  CPU - 512 - GoogLeNet
  CPU - 512 - ResNet-50
LeelaChessZero
Whisper.cpp
ASKAP:
  tConvolve MT - Degridding
  tConvolve MT - Gridding
libxsmm
LAMMPS Molecular Dynamics Simulator
Timed Linux Kernel Compilation
Monte Carlo Simulations of Ionised Nebulae
Timed LLVM Compilation
Palabos:
  500
  1000
OSPRay
Numpy Benchmark
Stress-NG:
  Socket Activity
  Context Switching
  IO_uring
Blender
OSPRay
Timed Gem5 Compilation
OSPRay Studio
asmFish
Stress-NG:
  Cloning
  SENDFILE
Ngspice
OSPRay Studio:
  2 - 4K - 32 - Path Tracer
  1 - 4K - 32 - Path Tracer
  3 - 4K - 1 - Path Tracer
Timed LLVM Compilation
OSPRay Studio:
  2 - 4K - 1 - Path Tracer
  1 - 4K - 1 - Path Tracer
Timed Node.js Compilation
Ngspice
OpenFOAM:
  drivaerFastback, Medium Mesh Size - Execution Time
  drivaerFastback, Medium Mesh Size - Mesh Time
OSPRay Studio
Palabos
OSPRay Studio:
  1 - 4K - 16 - Path Tracer
  3 - 4K - 16 - Path Tracer
OSPRay
Timed Godot Game Engine Compilation
PyHPC Benchmarks
Zstd Compression:
  19, Long Mode - Decompression Speed
  19, Long Mode - Compression Speed
TensorFlow:
  CPU - 64 - ResNet-50
  CPU - 256 - GoogLeNet
Stockfish
Laghos
Zstd Compression:
  19 - Decompression Speed
  19 - Compression Speed
OpenVINO:
  Person Detection FP32 - CPU:
    ms
    FPS
  Person Detection FP16 - CPU:
    ms
    FPS
  Face Detection FP16 - CPU:
    ms
    FPS
  Face Detection FP16-INT8 - CPU:
    ms
    FPS
Zstd Compression:
  12 - Decompression Speed
  12 - Compression Speed
  3, Long Mode - Decompression Speed
  3, Long Mode - Compression Speed
  3 - Decompression Speed
  3 - Compression Speed
  8, Long Mode - Decompression Speed
  8, Long Mode - Compression Speed
  8 - Decompression Speed
  8 - Compression Speed
OpenVINO:
  Person Vehicle Bike Detection FP16 - CPU:
    ms
    FPS
  Machine Translation EN To DE FP16 - CPU:
    ms
    FPS
  Age Gender Recognition Retail 0013 FP16 - CPU:
    ms
    FPS
  Vehicle Detection FP16-INT8 - CPU:
    ms
    FPS
  Age Gender Recognition Retail 0013 FP16-INT8 - CPU:
    ms
    FPS
  Weld Porosity Detection FP16-INT8 - CPU:
    ms
    FPS
  Vehicle Detection FP16 - CPU:
    ms
    FPS
  Weld Porosity Detection FP16 - CPU:
    ms
    FPS
OSPRay:
  gravity_spheres_volume/dim_512/scivis/real_time
  gravity_spheres_volume/dim_512/ao/real_time
  gravity_spheres_volume/dim_512/pathtracer/real_time
Neural Magic DeepSparse:
  CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream:
    ms/batch
    items/sec
Blender
Neural Magic DeepSparse:
  NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream:
    ms/batch
    items/sec
  NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream:
    ms/batch
    items/sec
TensorFlow
Laghos
Neural Magic DeepSparse:
  NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream:
    ms/batch
    items/sec
TensorFlow
Neural Magic DeepSparse:
  NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream:
    ms/batch
    items/sec
  NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream:
    ms/batch
    items/sec
Blender
Neural Magic DeepSparse:
  NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream:
    ms/batch
    items/sec
  CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream:
    ms/batch
    items/sec
  CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream:
    ms/batch
    items/sec
Timed Linux Kernel Compilation
GPAW
TensorFlow
7-Zip Compression:
  Decompression Rating
  Compression Rating
Timed PHP Compilation
Stress-NG
srsRAN Project
Stress-NG:
  Atomic
  MMAP
  Pthread
  MEMFD
Liquid-DSP
Stress-NG
Liquid-DSP
Stress-NG:
  Zlib
  Matrix 3D Math
Liquid-DSP
Stress-NG:
  Glibc Qsort Data Sorting
  Vector Math
  Forking
  NUMA
  Futex
  Glibc C String Functions
  Poll
  Wide Vector Math
  AVL Tree
  Memory Copying
  Floating Point
  Function Call
Liquid-DSP
Stress-NG:
  Fused Multiply-Add
  Crypto
  Pipe
  Hash
  Vector Floating Point
  Vector Shuffle
  Mutex
Liquid-DSP:
  192 - 256 - 32
  128 - 256 - 57
Stress-NG:
  Semaphores
  CPU Stress
  System V Message Passing
  Matrix Math
Liquid-DSP:
  128 - 256 - 32
  64 - 256 - 57
  64 - 256 - 32
ASKAP:
  tConvolve MPI - Gridding
  tConvolve MPI - Degridding
HeFFTe - Highly Efficient FFT for Exascale
NAMD
PyHPC Benchmarks
TensorFlow
Algebraic Multi-Grid Benchmark
GROMACS
TensorFlow
SPECFEM3D:
  Water-layered Halfspace
  Layered Halfspace
Blender
Remhos
LULESH
HeFFTe - Highly Efficient FFT for Exascale
Xmrig
srsRAN Project
OpenFOAM:
  drivaerFastback, Small Mesh Size - Execution Time
  drivaerFastback, Small Mesh Size - Mesh Time
Xmrig
Embree
TensorFlow
SPECFEM3D:
  Tomographic Model
  Mount St. Helens
  Homogeneous Halfspace
NAS Parallel Benchmarks
Monte Carlo Simulations of Ionised Nebulae
HeFFTe - Highly Efficient FFT for Exascale
TensorFlow
ASKAP
TensorFlow
CloverLeaf
Blender
NAS Parallel Benchmarks
ASTC Encoder:
  Thorough
  Exhaustive
NAS Parallel Benchmarks
miniFE
TensorFlow
Xcompact3d Incompact3d
Google Draco:
  Lion
  Church Facade
ASKAP:
  tConvolve OpenMP - Degridding
  tConvolve OpenMP - Gridding
NAS Parallel Benchmarks
Xcompact3d Incompact3d
TensorFlow
Embree
ACES DGEMM
NAS Parallel Benchmarks
HeFFTe - Highly Efficient FFT for Exascale
libxsmm
ASTC Encoder
libxsmm
Embree
NAS Parallel Benchmarks:
  FT.C
  CG.C
  MG.C
HeFFTe - Highly Efficient FFT for Exascale:
  c2c - FFTW - double - 256
  r2c - FFTW - double - 256
  c2c - FFTW - float - 256
  r2c - FFTW - float - 256
  c2c - FFTW - double - 128
  r2c - FFTW - double - 128
  c2c - FFTW - float - 128
  r2c - FFTW - float - 128