AMD EPYC 9684X 3D V-Cache

AMD EPYC 9684X 96-Core testing with a AMD Titanite_4G (RTI1007B BIOS) and ASPEED on Ubuntu 22.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2307207-NE-UPLOAD92587&grr.

AMD EPYC 9684X 3D V-CacheProcessorMotherboardChipsetMemoryDiskGraphicsNetworkOSKernelDesktopDisplay ServerVulkanCompilerFile-SystemScreen ResolutionDefaultAMD EPYC 9684X 96-Core @ 2.55GHz (96 Cores / 192 Threads)AMD Titanite_4G (RTI1007B BIOS)AMD Device 14a4768GB2 x 1920GB SAMSUNG MZWLJ1T9HBJR-00007ASPEEDBroadcom NetXtreme BCM5720 PCIeUbuntu 22.045.19.0-41-generic (x86_64)GNOME Shell 42.5X Server 1.21.1.41.3.224GCC 11.3.0ext41024x768OpenBenchmarking.org- Transparent Huge Pages: madvise- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v - Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa101121 - Python 3.10.6- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

AMD EPYC 9684X 3D V-Cachewrf: conus 2.5kmopenfoam: drivaerFastback, Large Mesh Size - Execution Timeopenfoam: drivaerFastback, Large Mesh Size - Mesh Timepetsc: Streamshpcg: 144 144 144 - 60hpcg: 192 192 192 - 60whisper-cpp: ggml-medium.en - 2016 State of the Unionlibxsmm: 128hpcg: 160 160 160 - 60hpcg: 104 104 104 - 60whisper-cpp: ggml-small.en - 2016 State of the Uniontensorflow: CPU - 256 - ResNet-50palabos: 400lczero: BLAStensorflow: CPU - 512 - GoogLeNettensorflow: CPU - 512 - ResNet-50lczero: Eigenwhisper-cpp: ggml-base.en - 2016 State of the Unionaskap: tConvolve MT - Degriddingaskap: tConvolve MT - Griddinglibxsmm: 256lammps: 20k Atomsbuild-linux-kernel: allmodconfigmocassin: Dust 2D tau100.0build-llvm: Unix Makefilespalabos: 500palabos: 1000ospray: particle_volume/pathtracer/real_timenumpy: stress-ng: Socket Activitystress-ng: Context Switchingstress-ng: IO_uringblender: Barbershop - CPU-Onlyospray: particle_volume/scivis/real_timebuild-gem5: Time To Compileospray-studio: 3 - 4K - 32 - Path Tracerasmfish: 1024 Hash Memory, 26 Depthstress-ng: Cloningstress-ng: SENDFILEngspice: C2670ospray-studio: 2 - 4K - 32 - Path Tracerospray-studio: 1 - 4K - 32 - Path Tracerospray-studio: 3 - 4K - 1 - Path Tracerbuild-llvm: Ninjaospray-studio: 2 - 4K - 1 - Path Tracerospray-studio: 1 - 4K - 1 - Path Tracerbuild-nodejs: Time To Compilengspice: C7552openfoam: drivaerFastback, Medium Mesh Size - Execution Timeopenfoam: drivaerFastback, Medium Mesh Size - Mesh Timeospray-studio: 2 - 4K - 16 - Path Tracerpalabos: 100ospray-studio: 1 - 4K - 16 - Path Tracerospray-studio: 3 - 4K - 16 - Path Tracerospray: particle_volume/ao/real_timebuild-godot: Time To Compilepyhpc: CPU - Numpy - 4194304 - Isoneutral Mixingcompress-zstd: 19, Long Mode - Decompression Speedcompress-zstd: 19, Long Mode - Compression Speedtensorflow: CPU - 64 - ResNet-50tensorflow: CPU - 256 - GoogLeNetstockfish: Total Timelaghos: Sedov Blast Wave, ube_922_hex.meshcompress-zstd: 19 - Decompression Speedcompress-zstd: 19 - Compression Speedopenvino: Person Detection FP32 - CPUopenvino: Person Detection FP32 - CPUopenvino: Person Detection FP16 - CPUopenvino: Person Detection FP16 - CPUopenvino: Face Detection FP16 - CPUopenvino: Face Detection FP16 - CPUopenvino: Face Detection FP16-INT8 - CPUopenvino: Face Detection FP16-INT8 - CPUcompress-zstd: 12 - Decompression Speedcompress-zstd: 12 - Compression Speedcompress-zstd: 3, Long Mode - Decompression Speedcompress-zstd: 3, Long Mode - Compression Speedcompress-zstd: 3 - Decompression Speedcompress-zstd: 3 - Compression Speedcompress-zstd: 8, Long Mode - Decompression Speedcompress-zstd: 8, Long Mode - Compression Speedcompress-zstd: 8 - Decompression Speedcompress-zstd: 8 - Compression Speedopenvino: Person Vehicle Bike Detection FP16 - CPUopenvino: Person Vehicle Bike Detection FP16 - CPUopenvino: Machine Translation EN To DE FP16 - CPUopenvino: Machine Translation EN To DE FP16 - CPUopenvino: Age Gender Recognition Retail 0013 FP16 - CPUopenvino: Age Gender Recognition Retail 0013 FP16 - CPUopenvino: Vehicle Detection FP16-INT8 - CPUopenvino: Vehicle Detection FP16-INT8 - CPUopenvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPUopenvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPUopenvino: Weld Porosity Detection FP16-INT8 - CPUopenvino: Weld Porosity Detection FP16-INT8 - CPUopenvino: Vehicle Detection FP16 - CPUopenvino: Vehicle Detection FP16 - CPUopenvino: Weld Porosity Detection FP16 - CPUopenvino: Weld Porosity Detection FP16 - CPUospray: gravity_spheres_volume/dim_512/scivis/real_timeospray: gravity_spheres_volume/dim_512/ao/real_timeospray: gravity_spheres_volume/dim_512/pathtracer/real_timedeepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Streamdeepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Streamblender: Pabellon Barcelona - CPU-Onlydeepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Streamdeepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Streamdeepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Streamdeepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Streamtensorflow: CPU - 32 - ResNet-50laghos: Triple Point Problemdeepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Streamdeepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Streamtensorflow: CPU - 512 - AlexNetdeepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Streamdeepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Streamdeepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Streamdeepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Streamblender: Classroom - CPU-Onlydeepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Streamdeepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Streamdeepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Streamdeepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Streamdeepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Streamdeepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Streambuild-linux-kernel: defconfiggpaw: Carbon Nanotubetensorflow: CPU - 16 - ResNet-50compress-7zip: Decompression Ratingcompress-7zip: Compression Ratingbuild-php: Time To Compilestress-ng: CPU Cachesrsran: PUSCH Processor Benchmark, Throughput Totalstress-ng: Atomicstress-ng: MMAPstress-ng: Pthreadstress-ng: MEMFDliquid-dsp: 192 - 256 - 512stress-ng: Mallocliquid-dsp: 128 - 256 - 512stress-ng: Zlibstress-ng: Matrix 3D Mathliquid-dsp: 64 - 256 - 512stress-ng: Glibc Qsort Data Sortingstress-ng: Vector Mathstress-ng: Forkingstress-ng: NUMAstress-ng: Futexstress-ng: Glibc C String Functionsstress-ng: Pollstress-ng: Wide Vector Mathstress-ng: AVL Treestress-ng: Memory Copyingstress-ng: Floating Pointstress-ng: Function Callliquid-dsp: 192 - 256 - 57stress-ng: Fused Multiply-Addstress-ng: Cryptostress-ng: Pipestress-ng: Hashstress-ng: Vector Floating Pointstress-ng: Vector Shufflestress-ng: Mutexliquid-dsp: 192 - 256 - 32liquid-dsp: 128 - 256 - 57stress-ng: Semaphoresstress-ng: CPU Stressstress-ng: System V Message Passingstress-ng: Matrix Mathliquid-dsp: 128 - 256 - 32liquid-dsp: 64 - 256 - 57liquid-dsp: 64 - 256 - 32askap: tConvolve MPI - Griddingaskap: tConvolve MPI - Degriddingheffte: c2c - FFTW - double - 512namd: ATPase Simulation - 327,506 Atomspyhpc: CPU - Numpy - 4194304 - Equation of Statetensorflow: CPU - 64 - GoogLeNetamg: gromacs: MPI CPU - water_GMX50_baretensorflow: CPU - 256 - AlexNetspecfem3d: Water-layered Halfspacespecfem3d: Layered Halfspaceblender: Fishy Cat - CPU-Onlyremhos: Sample Remap Examplelulesh: heffte: r2c - FFTW - double - 512xmrig: Monero - 1Msrsran: Downlink Processor Benchmarkopenfoam: drivaerFastback, Small Mesh Size - Execution Timeopenfoam: drivaerFastback, Small Mesh Size - Mesh Timexmrig: Wownero - 1Membree: Pathtracer ISPC - Asian Dragon Objtensorflow: CPU - 16 - GoogLeNetspecfem3d: Tomographic Modelspecfem3d: Mount St. Helensspecfem3d: Homogeneous Halfspacenpb: EP.Dmocassin: Gas HII40heffte: c2c - FFTW - float - 512tensorflow: CPU - 64 - AlexNetaskap: Hogbom Clean OpenMPtensorflow: CPU - 32 - GoogLeNetcloverleaf: Lagrangian-Eulerian Hydrodynamicsblender: BMW27 - CPU-Onlynpb: BT.Castcenc: Thoroughastcenc: Exhaustivenpb: SP.Cminife: Smalltensorflow: CPU - 16 - AlexNetincompact3d: input.i3d 129 Cells Per Directiondraco: Liondraco: Church Facadeaskap: tConvolve OpenMP - Degriddingaskap: tConvolve OpenMP - Griddingnpb: IS.Dincompact3d: input.i3d 193 Cells Per Directiontensorflow: CPU - 32 - AlexNetembree: Pathtracer ISPC - Crownmt-dgemm: Sustained Floating-Point Ratenpb: LU.Cheffte: r2c - FFTW - float - 512libxsmm: 32astcenc: Mediumlibxsmm: 64embree: Pathtracer ISPC - Asian Dragonnpb: FT.Cnpb: CG.Cnpb: MG.Cheffte: c2c - FFTW - double - 256heffte: r2c - FFTW - double - 256heffte: c2c - FFTW - float - 256heffte: r2c - FFTW - float - 256heffte: c2c - FFTW - double - 128heffte: r2c - FFTW - double - 128heffte: c2c - FFTW - float - 128heffte: r2c - FFTW - float - 128stress-ng: x86_64 RdRandDefault11269.2628994.4651585.13662272616.799424.614322.83321444.135832913.423.836926.8250769.11381115.51317.8209760409.02116.1511884332.0151415603.213582.83064.439.893202.557190.071184.047328.713370.189199.695586.502960.2415480601.674790428.16142.0325.1088137.6914039223404087012478.741517063.84118.39434341340071261112.88710651059105.227100.850181.01759108.3682317078591.534169532022325.157188.3841.5781357.88.5294.50387.68297289892446.931433.417.31766.5426.891753.6327.111003.5147.62529.6890.361653.3329.31534.5889.91498.83989.71648.1876.91639.91226.78.365732.7589.50535.770.8494505.118.455672.061.3364461.9310.848846.6712.423860.6410.284663.9725.927226.792326.5145417.8584114.177749.61783.575860.6307785.212260.584078.11238.89185.8506257.57121287.81145.0544329.977339.30751219.495840.5193.3229513.186460.0637797.9471139.0157344.405322.91934.77457.2162647264706333.4461397574.7518408.1237.831444.92103280.79454.751287866667360348999.65108110000010471.1716595.287275233332108.30545725.7340400.832478.543985709.5681208060.1713269096.623485374.331665.5732994.1829883.9367313.39463463333376566577.12202368.6160302912.6718746640.55257925.6163761.2949736118.0649585333333876733333223213939.86212380.7612085427.22418033.1337687333332609466667218126666773226.759791.167.85030.247330.767296.19241428300011.7931196.1621.11805134821.34796928920.6210.29330715.328134.82069684.3721.329.87796322.97411774009.7123.5135156.948.7389035858.89814524011.32314013610697.9013.278153.844786.601212.17238.1810.2916.28314777.9056.77806.1411207614.7054408.2303.802.175216135011604355153.026625.65696.517.68528681518.15117.826140.553180337910.64332.7421311.3419.42652455.4143.9954118831.8959737.94137308.2787.0392190.126178.441315.81379.9238129.875126.028185.389OpenBenchmarking.org

WRF

Input: conus 2.5km

OpenBenchmarking.orgSeconds, Fewer Is BetterWRF 4.2.2Input: conus 2.5kmDefault2K4K6K8K10K11269.261. (F9X) gfortran options: -O2 -ftree-vectorize -funroll-loops -ffree-form -fconvert=big-endian -frecord-marker=4 -fallow-invalid-boz -lesmf_time -lwrfio_nf -lnetcdff -lnetcdf -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

OpenFOAM

Input: drivaerFastback, Large Mesh Size - Execution Time

OpenBenchmarking.orgSeconds, Fewer Is BetterOpenFOAM 10Input: drivaerFastback, Large Mesh Size - Execution TimeDefault2K4K6K8K10K8994.471. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

OpenFOAM

Input: drivaerFastback, Large Mesh Size - Mesh Time

OpenBenchmarking.orgSeconds, Fewer Is BetterOpenFOAM 10Input: drivaerFastback, Large Mesh Size - Mesh TimeDefault130260390520650585.141. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

PETSc

Test: Streams

OpenBenchmarking.orgMB/s, More Is BetterPETSc 3.19Test: StreamsDefault60K120K180K240K300KSE +/- 6007.86, N = 9272616.801. (CC) gcc options: -fPIC -O3 -O2 -lpthread -ludev -lpciaccess -lm

High Performance Conjugate Gradient

X Y Z: 144 144 144 - RT: 60

OpenBenchmarking.orgGFLOP/s, More Is BetterHigh Performance Conjugate Gradient 3.1X Y Z: 144 144 144 - RT: 60Default612182430SE +/- 0.44, N = 924.611. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

High Performance Conjugate Gradient

X Y Z: 192 192 192 - RT: 60

OpenBenchmarking.orgGFLOP/s, More Is BetterHigh Performance Conjugate Gradient 3.1X Y Z: 192 192 192 - RT: 60Default510152025SE +/- 0.18, N = 322.831. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

Whisper.cpp

Model: ggml-medium.en - Input: 2016 State of the Union

OpenBenchmarking.orgSeconds, Fewer Is BetterWhisper.cpp 1.4Model: ggml-medium.en - Input: 2016 State of the UnionDefault30060090012001500SE +/- 5.78, N = 31444.141. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread

libxsmm

M N K: 128

OpenBenchmarking.orgGFLOPS/s, More Is Betterlibxsmm 2-1.17-3645M N K: 128Default6001200180024003000SE +/- 27.97, N = 92913.41. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

High Performance Conjugate Gradient

X Y Z: 160 160 160 - RT: 60

OpenBenchmarking.orgGFLOP/s, More Is BetterHigh Performance Conjugate Gradient 3.1X Y Z: 160 160 160 - RT: 60Default612182430SE +/- 0.34, N = 323.841. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

High Performance Conjugate Gradient

X Y Z: 104 104 104 - RT: 60

OpenBenchmarking.orgGFLOP/s, More Is BetterHigh Performance Conjugate Gradient 3.1X Y Z: 104 104 104 - RT: 60Default612182430SE +/- 0.82, N = 926.831. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

Whisper.cpp

Model: ggml-small.en - Input: 2016 State of the Union

OpenBenchmarking.orgSeconds, Fewer Is BetterWhisper.cpp 1.4Model: ggml-small.en - Input: 2016 State of the UnionDefault170340510680850SE +/- 3.33, N = 3769.111. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread

TensorFlow

Device: CPU - Batch Size: 256 - Model: ResNet-50

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 256 - Model: ResNet-50Default306090120150SE +/- 1.00, N = 9115.51

Palabos

Grid Size: 400

OpenBenchmarking.orgMega Site Updates Per Second, More Is BetterPalabos 2.3Grid Size: 400Default70140210280350SE +/- 2.86, N = 12317.821. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm

LeelaChessZero

Backend: BLAS

OpenBenchmarking.orgNodes Per Second, More Is BetterLeelaChessZero 0.28Backend: BLASDefault2K4K6K8K10KSE +/- 103.04, N = 597601. (CXX) g++ options: -flto -pthread

TensorFlow

Device: CPU - Batch Size: 512 - Model: GoogLeNet

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 512 - Model: GoogLeNetDefault90180270360450SE +/- 3.82, N = 12409.02

TensorFlow

Device: CPU - Batch Size: 512 - Model: ResNet-50

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 512 - Model: ResNet-50Default306090120150SE +/- 0.93, N = 3116.15

LeelaChessZero

Backend: Eigen

OpenBenchmarking.orgNodes Per Second, More Is BetterLeelaChessZero 0.28Backend: EigenDefault3K6K9K12K15KSE +/- 72.53, N = 3118841. (CXX) g++ options: -flto -pthread

Whisper.cpp

Model: ggml-base.en - Input: 2016 State of the Union

OpenBenchmarking.orgSeconds, Fewer Is BetterWhisper.cpp 1.4Model: ggml-base.en - Input: 2016 State of the UnionDefault70140210280350SE +/- 1.67, N = 3332.021. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread

ASKAP

Test: tConvolve MT - Degridding

OpenBenchmarking.orgMillion Grid Points Per Second, More Is BetterASKAP 1.0Test: tConvolve MT - DegriddingDefault3K6K9K12K15KSE +/- 16.78, N = 315603.21. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP

Test: tConvolve MT - Gridding

OpenBenchmarking.orgMillion Grid Points Per Second, More Is BetterASKAP 1.0Test: tConvolve MT - GriddingDefault3K6K9K12K15KSE +/- 1.20, N = 313582.81. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

libxsmm

M N K: 256

OpenBenchmarking.orgGFLOPS/s, More Is Betterlibxsmm 2-1.17-3645M N K: 256Default7001400210028003500SE +/- 32.92, N = 53064.41. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

LAMMPS Molecular Dynamics Simulator

Model: 20k Atoms

OpenBenchmarking.orgns/day, More Is BetterLAMMPS Molecular Dynamics Simulator 23Jun2022Model: 20k AtomsDefault918273645SE +/- 0.07, N = 339.891. (CXX) g++ options: -O3 -lm -ldl

Timed Linux Kernel Compilation

Build: allmodconfig

OpenBenchmarking.orgSeconds, Fewer Is BetterTimed Linux Kernel Compilation 6.1Build: allmodconfigDefault4080120160200SE +/- 0.58, N = 3202.56

Monte Carlo Simulations of Ionised Nebulae

Input: Dust 2D tau100.0

OpenBenchmarking.orgSeconds, Fewer Is BetterMonte Carlo Simulations of Ionised Nebulae 2.02.73.3Input: Dust 2D tau100.0Default4080120160200SE +/- 1.70, N = 3190.071. (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O2 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lz

Timed LLVM Compilation

Build System: Unix Makefiles

OpenBenchmarking.orgSeconds, Fewer Is BetterTimed LLVM Compilation 16.0Build System: Unix MakefilesDefault4080120160200SE +/- 1.07, N = 3184.05

Palabos

Grid Size: 500

OpenBenchmarking.orgMega Site Updates Per Second, More Is BetterPalabos 2.3Grid Size: 500Default70140210280350SE +/- 0.10, N = 3328.711. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm

Palabos

Grid Size: 1000

OpenBenchmarking.orgMega Site Updates Per Second, More Is BetterPalabos 2.3Grid Size: 1000Default80160240320400SE +/- 0.10, N = 3370.191. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm

OSPRay

Benchmark: particle_volume/pathtracer/real_time

OpenBenchmarking.orgItems Per Second, More Is BetterOSPRay 2.12Benchmark: particle_volume/pathtracer/real_timeDefault4080120160200SE +/- 0.09, N = 3199.70

Numpy Benchmark

OpenBenchmarking.orgScore, More Is BetterNumpy BenchmarkDefault130260390520650SE +/- 0.96, N = 3586.50

Stress-NG

Test: Socket Activity

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: Socket ActivityDefault6001200180024003000SE +/- 1108.03, N = 152960.241. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Context Switching

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: Context SwitchingDefault3M6M9M12M15MSE +/- 644041.01, N = 1515480601.671. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: IO_uring

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: IO_uringDefault1000K2000K3000K4000K5000KSE +/- 576446.94, N = 124790428.161. (CXX) g++ options: -O2 -std=gnu99 -lc

Blender

Blend File: Barbershop - Compute: CPU-Only

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 3.6Blend File: Barbershop - Compute: CPU-OnlyDefault306090120150SE +/- 0.01, N = 3142.03

OSPRay

Benchmark: particle_volume/scivis/real_time

OpenBenchmarking.orgItems Per Second, More Is BetterOSPRay 2.12Benchmark: particle_volume/scivis/real_timeDefault612182430SE +/- 0.01, N = 325.11

Timed Gem5 Compilation

Time To Compile

OpenBenchmarking.orgSeconds, Fewer Is BetterTimed Gem5 Compilation 21.2Time To CompileDefault306090120150SE +/- 0.72, N = 3137.69

OSPRay Studio

Camera: 3 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer

OpenBenchmarking.orgms, Fewer Is BetterOSPRay Studio 0.11Camera: 3 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path TracerDefault9K18K27K36K45KSE +/- 43.02, N = 3403921. (CXX) g++ options: -O3 -lm -ldl

asmFish

1024 Hash Memory, 26 Depth

OpenBenchmarking.orgNodes/second, More Is BetterasmFish 2018-07-231024 Hash Memory, 26 DepthDefault50M100M150M200M250MSE +/- 1310608.25, N = 3234040870

Stress-NG

Test: Cloning

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: CloningDefault3K6K9K12K15KSE +/- 972.22, N = 1212478.741. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: SENDFILE

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: SENDFILEDefault300K600K900K1200K1500KSE +/- 117132.40, N = 121517063.841. (CXX) g++ options: -O2 -std=gnu99 -lc

Ngspice

Circuit: C2670

OpenBenchmarking.orgSeconds, Fewer Is BetterNgspice 34Circuit: C2670Default306090120150SE +/- 0.26, N = 3118.391. (CC) gcc options: -O0 -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE

OSPRay Studio

Camera: 2 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer

OpenBenchmarking.orgms, Fewer Is BetterOSPRay Studio 0.11Camera: 2 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path TracerDefault7K14K21K28K35KSE +/- 51.48, N = 3343411. (CXX) g++ options: -O3 -lm -ldl

OSPRay Studio

Camera: 1 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer

OpenBenchmarking.orgms, Fewer Is BetterOSPRay Studio 0.11Camera: 1 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path TracerDefault7K14K21K28K35KSE +/- 22.67, N = 3340071. (CXX) g++ options: -O3 -lm -ldl

OSPRay Studio

Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer

OpenBenchmarking.orgms, Fewer Is BetterOSPRay Studio 0.11Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path TracerDefault30060090012001500SE +/- 0.33, N = 312611. (CXX) g++ options: -O3 -lm -ldl

Timed LLVM Compilation

Build System: Ninja

OpenBenchmarking.orgSeconds, Fewer Is BetterTimed LLVM Compilation 16.0Build System: NinjaDefault306090120150SE +/- 0.31, N = 3112.89

OSPRay Studio

Camera: 2 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer

OpenBenchmarking.orgms, Fewer Is BetterOSPRay Studio 0.11Camera: 2 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path TracerDefault2004006008001000SE +/- 2.33, N = 310651. (CXX) g++ options: -O3 -lm -ldl

OSPRay Studio

Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer

OpenBenchmarking.orgms, Fewer Is BetterOSPRay Studio 0.11Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path TracerDefault2004006008001000SE +/- 1.15, N = 310591. (CXX) g++ options: -O3 -lm -ldl

Timed Node.js Compilation

Time To Compile

OpenBenchmarking.orgSeconds, Fewer Is BetterTimed Node.js Compilation 19.8.1Time To CompileDefault20406080100SE +/- 0.15, N = 3105.23

Ngspice

Circuit: C7552

OpenBenchmarking.orgSeconds, Fewer Is BetterNgspice 34Circuit: C7552Default20406080100SE +/- 0.08, N = 3100.851. (CC) gcc options: -O0 -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE

OpenFOAM

Input: drivaerFastback, Medium Mesh Size - Execution Time

OpenBenchmarking.orgSeconds, Fewer Is BetterOpenFOAM 10Input: drivaerFastback, Medium Mesh Size - Execution TimeDefault4080120160200181.021. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

OpenFOAM

Input: drivaerFastback, Medium Mesh Size - Mesh Time

OpenBenchmarking.orgSeconds, Fewer Is BetterOpenFOAM 10Input: drivaerFastback, Medium Mesh Size - Mesh TimeDefault20406080100108.371. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

OSPRay Studio

Camera: 2 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer

OpenBenchmarking.orgms, Fewer Is BetterOSPRay Studio 0.11Camera: 2 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path TracerDefault4K8K12K16K20KSE +/- 6.03, N = 3170781. (CXX) g++ options: -O3 -lm -ldl

Palabos

Grid Size: 100

OpenBenchmarking.orgMega Site Updates Per Second, More Is BetterPalabos 2.3Grid Size: 100Default130260390520650SE +/- 4.57, N = 3591.531. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm

OSPRay Studio

Camera: 1 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer

OpenBenchmarking.orgms, Fewer Is BetterOSPRay Studio 0.11Camera: 1 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path TracerDefault4K8K12K16K20KSE +/- 50.26, N = 3169531. (CXX) g++ options: -O3 -lm -ldl

OSPRay Studio

Camera: 3 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer

OpenBenchmarking.orgms, Fewer Is BetterOSPRay Studio 0.11Camera: 3 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path TracerDefault4K8K12K16K20KSE +/- 50.67, N = 3202231. (CXX) g++ options: -O3 -lm -ldl

OSPRay

Benchmark: particle_volume/ao/real_time

OpenBenchmarking.orgItems Per Second, More Is BetterOSPRay 2.12Benchmark: particle_volume/ao/real_timeDefault612182430SE +/- 0.01, N = 325.16

Timed Godot Game Engine Compilation

Time To Compile

OpenBenchmarking.orgSeconds, Fewer Is BetterTimed Godot Game Engine Compilation 4.0Time To CompileDefault20406080100SE +/- 0.06, N = 388.38

PyHPC Benchmarks

Device: CPU - Backend: Numpy - Project Size: 4194304 - Benchmark: Isoneutral Mixing

OpenBenchmarking.orgSeconds, Fewer Is BetterPyHPC Benchmarks 3.0Device: CPU - Backend: Numpy - Project Size: 4194304 - Benchmark: Isoneutral MixingDefault0.35510.71021.06531.42041.7755SE +/- 0.001, N = 31.578

Zstd Compression

Compression Level: 19, Long Mode - Decompression Speed

OpenBenchmarking.orgMB/s, More Is BetterZstd Compression 1.5.4Compression Level: 19, Long Mode - Decompression SpeedDefault30060090012001500SE +/- 2.33, N = 31357.81. (CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 19, Long Mode - Compression Speed

OpenBenchmarking.orgMB/s, More Is BetterZstd Compression 1.5.4Compression Level: 19, Long Mode - Compression SpeedDefault246810SE +/- 0.01, N = 38.521. (CC) gcc options: -O3 -pthread -lz -llzma

TensorFlow

Device: CPU - Batch Size: 64 - Model: ResNet-50

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 64 - Model: ResNet-50Default20406080100SE +/- 0.09, N = 394.50

TensorFlow

Device: CPU - Batch Size: 256 - Model: GoogLeNet

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 256 - Model: GoogLeNetDefault80160240320400SE +/- 0.40, N = 3387.68

Stockfish

Total Time

OpenBenchmarking.orgNodes Per Second, More Is BetterStockfish 15Total TimeDefault60M120M180M240M300MSE +/- 1047576.89, N = 32972898921. (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -msse -msse3 -mpopcnt -mavx2 -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto=jobserver

Laghos

Test: Sedov Blast Wave, ube_922_hex.mesh

OpenBenchmarking.orgMajor Kernels Total Rate, More Is BetterLaghos 3.1Test: Sedov Blast Wave, ube_922_hex.meshDefault100200300400500SE +/- 1.52, N = 3446.931. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi

Zstd Compression

Compression Level: 19 - Decompression Speed

OpenBenchmarking.orgMB/s, More Is BetterZstd Compression 1.5.4Compression Level: 19 - Decompression SpeedDefault30060090012001500SE +/- 6.29, N = 31433.41. (CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 19 - Compression Speed

OpenBenchmarking.orgMB/s, More Is BetterZstd Compression 1.5.4Compression Level: 19 - Compression SpeedDefault48121620SE +/- 0.03, N = 317.31. (CC) gcc options: -O3 -pthread -lz -llzma

OpenVINO

Model: Person Detection FP32 - Device: CPU

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2022.3Model: Person Detection FP32 - Device: CPUDefault400800120016002000SE +/- 6.69, N = 31766.54MIN: 966.02 / MAX: 2507.191. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO

Model: Person Detection FP32 - Device: CPU

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2022.3Model: Person Detection FP32 - Device: CPUDefault612182430SE +/- 0.10, N = 326.891. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO

Model: Person Detection FP16 - Device: CPU

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2022.3Model: Person Detection FP16 - Device: CPUDefault400800120016002000SE +/- 4.64, N = 31753.63MIN: 955.99 / MAX: 2502.091. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO

Model: Person Detection FP16 - Device: CPU

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2022.3Model: Person Detection FP16 - Device: CPUDefault612182430SE +/- 0.10, N = 327.111. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO

Model: Face Detection FP16 - Device: CPU

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2022.3Model: Face Detection FP16 - Device: CPUDefault2004006008001000SE +/- 0.35, N = 31003.51MIN: 498.16 / MAX: 1059.951. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO

Model: Face Detection FP16 - Device: CPU

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2022.3Model: Face Detection FP16 - Device: CPUDefault1122334455SE +/- 0.02, N = 347.621. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO

Model: Face Detection FP16-INT8 - Device: CPU

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2022.3Model: Face Detection FP16-INT8 - Device: CPUDefault110220330440550SE +/- 0.05, N = 3529.68MIN: 257.02 / MAX: 560.931. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO

Model: Face Detection FP16-INT8 - Device: CPU

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2022.3Model: Face Detection FP16-INT8 - Device: CPUDefault20406080100SE +/- 0.01, N = 390.361. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

Zstd Compression

Compression Level: 12 - Decompression Speed

OpenBenchmarking.orgMB/s, More Is BetterZstd Compression 1.5.4Compression Level: 12 - Decompression SpeedDefault400800120016002000SE +/- 2.05, N = 31653.31. (CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 12 - Compression Speed

OpenBenchmarking.orgMB/s, More Is BetterZstd Compression 1.5.4Compression Level: 12 - Compression SpeedDefault70140210280350SE +/- 1.73, N = 3329.31. (CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 3, Long Mode - Decompression Speed

OpenBenchmarking.orgMB/s, More Is BetterZstd Compression 1.5.4Compression Level: 3, Long Mode - Decompression SpeedDefault30060090012001500SE +/- 2.41, N = 31534.51. (CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 3, Long Mode - Compression Speed

OpenBenchmarking.orgMB/s, More Is BetterZstd Compression 1.5.4Compression Level: 3, Long Mode - Compression SpeedDefault2004006008001000SE +/- 11.65, N = 3889.91. (CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 3 - Decompression Speed

OpenBenchmarking.orgMB/s, More Is BetterZstd Compression 1.5.4Compression Level: 3 - Decompression SpeedDefault30060090012001500SE +/- 1.17, N = 31498.81. (CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 3 - Compression Speed

OpenBenchmarking.orgMB/s, More Is BetterZstd Compression 1.5.4Compression Level: 3 - Compression SpeedDefault9001800270036004500SE +/- 22.49, N = 33989.71. (CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 8, Long Mode - Decompression Speed

OpenBenchmarking.orgMB/s, More Is BetterZstd Compression 1.5.4Compression Level: 8, Long Mode - Decompression SpeedDefault400800120016002000SE +/- 0.64, N = 31648.11. (CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 8, Long Mode - Compression Speed

OpenBenchmarking.orgMB/s, More Is BetterZstd Compression 1.5.4Compression Level: 8, Long Mode - Compression SpeedDefault2004006008001000SE +/- 3.55, N = 3876.91. (CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 8 - Decompression Speed

OpenBenchmarking.orgMB/s, More Is BetterZstd Compression 1.5.4Compression Level: 8 - Decompression SpeedDefault400800120016002000SE +/- 2.25, N = 31639.91. (CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 8 - Compression Speed

OpenBenchmarking.orgMB/s, More Is BetterZstd Compression 1.5.4Compression Level: 8 - Compression SpeedDefault30060090012001500SE +/- 1.56, N = 31226.71. (CC) gcc options: -O3 -pthread -lz -llzma

OpenVINO

Model: Person Vehicle Bike Detection FP16 - Device: CPU

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2022.3Model: Person Vehicle Bike Detection FP16 - Device: CPUDefault246810SE +/- 0.00, N = 38.36MIN: 4.99 / MAX: 31.021. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO

Model: Person Vehicle Bike Detection FP16 - Device: CPU

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2022.3Model: Person Vehicle Bike Detection FP16 - Device: CPUDefault12002400360048006000SE +/- 2.96, N = 35732.751. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO

Model: Machine Translation EN To DE FP16 - Device: CPU

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2022.3Model: Machine Translation EN To DE FP16 - Device: CPUDefault20406080100SE +/- 0.11, N = 389.50MIN: 39.21 / MAX: 143.981. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO

Model: Machine Translation EN To DE FP16 - Device: CPU

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2022.3Model: Machine Translation EN To DE FP16 - Device: CPUDefault120240360480600SE +/- 0.72, N = 3535.771. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2022.3Model: Age Gender Recognition Retail 0013 FP16 - Device: CPUDefault0.1890.3780.5670.7560.945SE +/- 0.00, N = 30.84MIN: 0.3 / MAX: 12.711. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2022.3Model: Age Gender Recognition Retail 0013 FP16 - Device: CPUDefault20K40K60K80K100KSE +/- 57.42, N = 394505.111. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO

Model: Vehicle Detection FP16-INT8 - Device: CPU

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2022.3Model: Vehicle Detection FP16-INT8 - Device: CPUDefault246810SE +/- 0.00, N = 38.45MIN: 4.5 / MAX: 30.541. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO

Model: Vehicle Detection FP16-INT8 - Device: CPU

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2022.3Model: Vehicle Detection FP16-INT8 - Device: CPUDefault12002400360048006000SE +/- 0.38, N = 35672.061. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2022.3Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPUDefault0.29930.59860.89791.19721.4965SE +/- 0.00, N = 31.33MIN: 0.47 / MAX: 14.671. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2022.3Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPUDefault14K28K42K56K70KSE +/- 63.50, N = 364461.931. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO

Model: Weld Porosity Detection FP16-INT8 - Device: CPU

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2022.3Model: Weld Porosity Detection FP16-INT8 - Device: CPUDefault3691215SE +/- 0.00, N = 310.84MIN: 5.62 / MAX: 24.411. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO

Model: Weld Porosity Detection FP16-INT8 - Device: CPU

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2022.3Model: Weld Porosity Detection FP16-INT8 - Device: CPUDefault2K4K6K8K10KSE +/- 0.12, N = 38846.671. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO

Model: Vehicle Detection FP16 - Device: CPU

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2022.3Model: Vehicle Detection FP16 - Device: CPUDefault3691215SE +/- 0.00, N = 312.42MIN: 5.8 / MAX: 47.231. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO

Model: Vehicle Detection FP16 - Device: CPU

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2022.3Model: Vehicle Detection FP16 - Device: CPUDefault8001600240032004000SE +/- 0.30, N = 33860.641. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO

Model: Weld Porosity Detection FP16 - Device: CPU

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2022.3Model: Weld Porosity Detection FP16 - Device: CPUDefault3691215SE +/- 0.00, N = 310.28MIN: 5.38 / MAX: 28.561. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO

Model: Weld Porosity Detection FP16 - Device: CPU

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2022.3Model: Weld Porosity Detection FP16 - Device: CPUDefault10002000300040005000SE +/- 0.19, N = 34663.971. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OSPRay

Benchmark: gravity_spheres_volume/dim_512/scivis/real_time

OpenBenchmarking.orgItems Per Second, More Is BetterOSPRay 2.12Benchmark: gravity_spheres_volume/dim_512/scivis/real_timeDefault612182430SE +/- 0.04, N = 325.93

OSPRay

Benchmark: gravity_spheres_volume/dim_512/ao/real_time

OpenBenchmarking.orgItems Per Second, More Is BetterOSPRay 2.12Benchmark: gravity_spheres_volume/dim_512/ao/real_timeDefault612182430SE +/- 0.12, N = 326.79

OSPRay

Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time

OpenBenchmarking.orgItems Per Second, More Is BetterOSPRay 2.12Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_timeDefault612182430SE +/- 0.02, N = 326.51

Neural Magic DeepSparse

Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.5Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-StreamDefault90180270360450SE +/- 0.35, N = 3417.86

Neural Magic DeepSparse

Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.5Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-StreamDefault306090120150SE +/- 0.07, N = 3114.18

Blender

Blend File: Pabellon Barcelona - Compute: CPU-Only

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 3.6Blend File: Pabellon Barcelona - Compute: CPU-OnlyDefault1122334455SE +/- 0.08, N = 349.61

Neural Magic DeepSparse

Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.5Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-StreamDefault2004006008001000SE +/- 1.88, N = 3783.58

Neural Magic DeepSparse

Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.5Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-StreamDefault1428425670SE +/- 0.03, N = 360.63

Neural Magic DeepSparse

Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.5Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-StreamDefault2004006008001000SE +/- 0.16, N = 3785.21

Neural Magic DeepSparse

Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.5Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-StreamDefault1428425670SE +/- 0.02, N = 360.58

TensorFlow

Device: CPU - Batch Size: 32 - Model: ResNet-50

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 32 - Model: ResNet-50Default20406080100SE +/- 0.18, N = 378.11

Laghos

Test: Triple Point Problem

OpenBenchmarking.orgMajor Kernels Total Rate, More Is BetterLaghos 3.1Test: Triple Point ProblemDefault50100150200250SE +/- 1.65, N = 3238.891. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi

Neural Magic DeepSparse

Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.5Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-StreamDefault4080120160200SE +/- 0.04, N = 3185.85

Neural Magic DeepSparse

Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.5Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-StreamDefault60120180240300SE +/- 0.06, N = 3257.57

TensorFlow

Device: CPU - Batch Size: 512 - Model: AlexNet

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 512 - Model: AlexNetDefault30060090012001500SE +/- 0.67, N = 31287.81

Neural Magic DeepSparse

Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.5Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-StreamDefault306090120150SE +/- 0.07, N = 3145.05

Neural Magic DeepSparse

Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.5Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-StreamDefault70140210280350SE +/- 0.25, N = 3329.98

Neural Magic DeepSparse

Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.5Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-StreamDefault918273645SE +/- 0.03, N = 339.31

Neural Magic DeepSparse

Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.5Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-StreamDefault30060090012001500SE +/- 0.89, N = 31219.50

Blender

Blend File: Classroom - Compute: CPU-Only

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 3.6Blend File: Classroom - Compute: CPU-OnlyDefault918273645SE +/- 0.11, N = 340.51

Neural Magic DeepSparse

Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.5Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-StreamDefault20406080100SE +/- 0.02, N = 393.32

Neural Magic DeepSparse

Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.5Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-StreamDefault110220330440550SE +/- 0.30, N = 3513.19

Neural Magic DeepSparse

Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.5Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-StreamDefault1326395265SE +/- 0.03, N = 360.06

Neural Magic DeepSparse

Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.5Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-StreamDefault2004006008001000SE +/- 0.46, N = 3797.95

Neural Magic DeepSparse

Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.5Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-StreamDefault306090120150SE +/- 0.13, N = 3139.02

Neural Magic DeepSparse

Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.5Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-StreamDefault70140210280350SE +/- 0.41, N = 3344.41

Timed Linux Kernel Compilation

Build: defconfig

OpenBenchmarking.orgSeconds, Fewer Is BetterTimed Linux Kernel Compilation 6.1Build: defconfigDefault510152025SE +/- 0.25, N = 522.92

GPAW

Input: Carbon Nanotube

OpenBenchmarking.orgSeconds, Fewer Is BetterGPAW 23.6Input: Carbon NanotubeDefault816243240SE +/- 0.18, N = 334.771. (CC) gcc options: -shared -fwrapv -O2 -lxc -lblas -lmpi

TensorFlow

Device: CPU - Batch Size: 16 - Model: ResNet-50

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 16 - Model: ResNet-50Default1326395265SE +/- 0.21, N = 357.21

7-Zip Compression

Test: Decompression Rating

OpenBenchmarking.orgMIPS, More Is Better7-Zip Compression 22.01Test: Decompression RatingDefault130K260K390K520K650KSE +/- 1050.05, N = 36264721. (CXX) g++ options: -lpthread -ldl -O2 -fPIC

7-Zip Compression

Test: Compression Rating

OpenBenchmarking.orgMIPS, More Is Better7-Zip Compression 22.01Test: Compression RatingDefault140K280K420K560K700KSE +/- 469.85, N = 36470631. (CXX) g++ options: -lpthread -ldl -O2 -fPIC

Timed PHP Compilation

Time To Compile

OpenBenchmarking.orgSeconds, Fewer Is BetterTimed PHP Compilation 8.1.9Time To CompileDefault816243240SE +/- 0.27, N = 333.45

Stress-NG

Test: CPU Cache

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: CPU CacheDefault300K600K900K1200K1500KSE +/- 11233.16, N = 31397574.751. (CXX) g++ options: -O2 -std=gnu99 -lc

srsRAN Project

Test: PUSCH Processor Benchmark, Throughput Total

OpenBenchmarking.orgMbps, More Is BettersrsRAN Project 23.5Test: PUSCH Processor Benchmark, Throughput TotalDefault4K8K12K16K20KSE +/- 23.08, N = 318408.11. (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest

Stress-NG

Test: Atomic

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: AtomicDefault50100150200250SE +/- 0.05, N = 3237.831. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: MMAP

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: MMAPDefault30060090012001500SE +/- 0.78, N = 31444.921. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Pthread

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: PthreadDefault20K40K60K80K100KSE +/- 525.98, N = 3103280.791. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: MEMFD

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: MEMFDDefault100200300400500SE +/- 0.83, N = 3454.751. (CXX) g++ options: -O2 -std=gnu99 -lc

Liquid-DSP

Threads: 192 - Buffer Length: 256 - Filter Length: 512

OpenBenchmarking.orgsamples/s, More Is BetterLiquid-DSP 1.6Threads: 192 - Buffer Length: 256 - Filter Length: 512Default300M600M900M1200M1500MSE +/- 762306.44, N = 312878666671. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Stress-NG

Test: Malloc

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: MallocDefault80M160M240M320M400MSE +/- 494466.73, N = 3360348999.651. (CXX) g++ options: -O2 -std=gnu99 -lc

Liquid-DSP

Threads: 128 - Buffer Length: 256 - Filter Length: 512

OpenBenchmarking.orgsamples/s, More Is BetterLiquid-DSP 1.6Threads: 128 - Buffer Length: 256 - Filter Length: 512Default200M400M600M800M1000MSE +/- 953939.20, N = 310811000001. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Stress-NG

Test: Zlib

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: ZlibDefault2K4K6K8K10KSE +/- 2.81, N = 310471.171. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Matrix 3D Math

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: Matrix 3D MathDefault4K8K12K16K20KSE +/- 154.68, N = 316595.281. (CXX) g++ options: -O2 -std=gnu99 -lc

Liquid-DSP

Threads: 64 - Buffer Length: 256 - Filter Length: 512

OpenBenchmarking.orgsamples/s, More Is BetterLiquid-DSP 1.6Threads: 64 - Buffer Length: 256 - Filter Length: 512Default160M320M480M640M800MSE +/- 2197592.12, N = 37275233331. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Stress-NG

Test: Glibc Qsort Data Sorting

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: Glibc Qsort Data SortingDefault5001000150020002500SE +/- 0.52, N = 32108.301. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Vector Math

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: Vector MathDefault120K240K360K480K600KSE +/- 61.16, N = 3545725.731. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Forking

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: ForkingDefault9K18K27K36K45KSE +/- 70.57, N = 340400.831. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: NUMA

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: NUMADefault5001000150020002500SE +/- 6.42, N = 32478.541. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Futex

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: FutexDefault900K1800K2700K3600K4500KSE +/- 13786.79, N = 33985709.561. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Glibc C String Functions

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: Glibc C String FunctionsDefault20M40M60M80M100MSE +/- 512163.22, N = 381208060.171. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Poll

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: PollDefault3M6M9M12M15MSE +/- 7203.48, N = 313269096.621. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Wide Vector Math

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: Wide Vector MathDefault700K1400K2100K2800K3500KSE +/- 964.27, N = 33485374.331. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: AVL Tree

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: AVL TreeDefault400800120016002000SE +/- 0.40, N = 31665.571. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Memory Copying

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: Memory CopyingDefault7K14K21K28K35KSE +/- 1.22, N = 332994.181. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Floating Point

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: Floating PointDefault6K12K18K24K30KSE +/- 16.24, N = 329883.931. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Function Call

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: Function CallDefault14K28K42K56K70KSE +/- 29.99, N = 367313.391. (CXX) g++ options: -O2 -std=gnu99 -lc

Liquid-DSP

Threads: 192 - Buffer Length: 256 - Filter Length: 57

OpenBenchmarking.orgsamples/s, More Is BetterLiquid-DSP 1.6Threads: 192 - Buffer Length: 256 - Filter Length: 57Default1000M2000M3000M4000M5000MSE +/- 3001296.02, N = 346346333331. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Stress-NG

Test: Fused Multiply-Add

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: Fused Multiply-AddDefault16M32M48M64M80MSE +/- 17500.16, N = 376566577.121. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Crypto

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: CryptoDefault40K80K120K160K200KSE +/- 436.93, N = 3202368.611. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Pipe

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: PipeDefault13M26M39M52M65MSE +/- 612854.45, N = 360302912.671. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Hash

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: HashDefault4M8M12M16M20MSE +/- 3937.95, N = 318746640.551. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Vector Floating Point

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: Vector Floating PointDefault60K120K180K240K300KSE +/- 151.27, N = 3257925.611. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Vector Shuffle

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: Vector ShuffleDefault14K28K42K56K70KSE +/- 3.12, N = 363761.291. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Mutex

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: MutexDefault11M22M33M44M55MSE +/- 152725.58, N = 349736118.061. (CXX) g++ options: -O2 -std=gnu99 -lc

Liquid-DSP

Threads: 192 - Buffer Length: 256 - Filter Length: 32

OpenBenchmarking.orgsamples/s, More Is BetterLiquid-DSP 1.6Threads: 192 - Buffer Length: 256 - Filter Length: 32Default1100M2200M3300M4400M5500MSE +/- 202758.75, N = 349585333331. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 128 - Buffer Length: 256 - Filter Length: 57

OpenBenchmarking.orgsamples/s, More Is BetterLiquid-DSP 1.6Threads: 128 - Buffer Length: 256 - Filter Length: 57Default800M1600M2400M3200M4000MSE +/- 4532230.26, N = 338767333331. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Stress-NG

Test: Semaphores

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: SemaphoresDefault50M100M150M200M250MSE +/- 2019374.88, N = 3223213939.861. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: CPU Stress

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: CPU StressDefault50K100K150K200K250KSE +/- 233.96, N = 3212380.761. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: System V Message Passing

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: System V Message PassingDefault3M6M9M12M15MSE +/- 5434.16, N = 312085427.221. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG

Test: Matrix Math

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: Matrix MathDefault90K180K270K360K450KSE +/- 67.17, N = 3418033.131. (CXX) g++ options: -O2 -std=gnu99 -lc

Liquid-DSP

Threads: 128 - Buffer Length: 256 - Filter Length: 32

OpenBenchmarking.orgsamples/s, More Is BetterLiquid-DSP 1.6Threads: 128 - Buffer Length: 256 - Filter Length: 32Default800M1600M2400M3200M4000MSE +/- 1471205.10, N = 337687333331. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 64 - Buffer Length: 256 - Filter Length: 57

OpenBenchmarking.orgsamples/s, More Is BetterLiquid-DSP 1.6Threads: 64 - Buffer Length: 256 - Filter Length: 57Default600M1200M1800M2400M3000MSE +/- 11597461.41, N = 326094666671. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 64 - Buffer Length: 256 - Filter Length: 32

OpenBenchmarking.orgsamples/s, More Is BetterLiquid-DSP 1.6Threads: 64 - Buffer Length: 256 - Filter Length: 32Default500M1000M1500M2000M2500MSE +/- 995545.63, N = 321812666671. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

ASKAP

Test: tConvolve MPI - Gridding

OpenBenchmarking.orgMpix/sec, More Is BetterASKAP 1.0Test: tConvolve MPI - GriddingDefault16K32K48K64K80KSE +/- 0.00, N = 373226.71. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP

Test: tConvolve MPI - Degridding

OpenBenchmarking.orgMpix/sec, More Is BetterASKAP 1.0Test: tConvolve MPI - DegriddingDefault13K26K39K52K65KSE +/- 380.83, N = 359791.11. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512Default1530456075SE +/- 0.10, N = 367.851. (CXX) g++ options: -O3

NAMD

ATPase Simulation - 327,506 Atoms

OpenBenchmarking.orgdays/ns, Fewer Is BetterNAMD 2.14ATPase Simulation - 327,506 AtomsDefault0.05560.11120.16680.22240.278SE +/- 0.00040, N = 30.24733

PyHPC Benchmarks

Device: CPU - Backend: Numpy - Project Size: 4194304 - Benchmark: Equation of State

OpenBenchmarking.orgSeconds, Fewer Is BetterPyHPC Benchmarks 3.0Device: CPU - Backend: Numpy - Project Size: 4194304 - Benchmark: Equation of StateDefault0.17260.34520.51780.69040.863SE +/- 0.004, N = 30.767

TensorFlow

Device: CPU - Batch Size: 64 - Model: GoogLeNet

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 64 - Model: GoogLeNetDefault60120180240300SE +/- 0.03, N = 3296.19

Algebraic Multi-Grid Benchmark

OpenBenchmarking.orgFigure Of Merit, More Is BetterAlgebraic Multi-Grid Benchmark 1.2Default500M1000M1500M2000M2500MSE +/- 4643811.51, N = 324142830001. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -lmpi

GROMACS

Implementation: MPI CPU - Input: water_GMX50_bare

OpenBenchmarking.orgNs Per Day, More Is BetterGROMACS 2023Implementation: MPI CPU - Input: water_GMX50_bareDefault3691215SE +/- 0.01, N = 311.791. (CXX) g++ options: -O3

TensorFlow

Device: CPU - Batch Size: 256 - Model: AlexNet

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 256 - Model: AlexNetDefault30060090012001500SE +/- 0.47, N = 31196.16

SPECFEM3D

Model: Water-layered Halfspace

OpenBenchmarking.orgSeconds, Fewer Is BetterSPECFEM3D 4.0Model: Water-layered HalfspaceDefault510152025SE +/- 0.19, N = 321.121. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

SPECFEM3D

Model: Layered Halfspace

OpenBenchmarking.orgSeconds, Fewer Is BetterSPECFEM3D 4.0Model: Layered HalfspaceDefault510152025SE +/- 0.10, N = 321.351. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Blender

Blend File: Fishy Cat - Compute: CPU-Only

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 3.6Blend File: Fishy Cat - Compute: CPU-OnlyDefault510152025SE +/- 0.09, N = 320.62

Remhos

Test: Sample Remap Example

OpenBenchmarking.orgSeconds, Fewer Is BetterRemhos 1.0Test: Sample Remap ExampleDefault3691215SE +/- 0.10, N = 610.291. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi

LULESH

OpenBenchmarking.orgz/s, More Is BetterLULESH 2.0.3Default7K14K21K28K35KSE +/- 182.94, N = 430715.331. (CXX) g++ options: -O3 -fopenmp -lm -lmpi_cxx -lmpi

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512Default306090120150SE +/- 0.58, N = 4134.821. (CXX) g++ options: -O3

Xmrig

Variant: Monero - Hash Count: 1M

OpenBenchmarking.orgH/s, More Is BetterXmrig 6.18.1Variant: Monero - Hash Count: 1MDefault15K30K45K60K75KSE +/- 83.67, N = 469684.31. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

srsRAN Project

Test: Downlink Processor Benchmark

OpenBenchmarking.orgMbps, More Is BettersrsRAN Project 23.5Test: Downlink Processor BenchmarkDefault160320480640800SE +/- 6.73, N = 3721.31. (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest

OpenFOAM

Input: drivaerFastback, Small Mesh Size - Execution Time

OpenBenchmarking.orgSeconds, Fewer Is BetterOpenFOAM 10Input: drivaerFastback, Small Mesh Size - Execution TimeDefault71421283529.881. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

OpenFOAM

Input: drivaerFastback, Small Mesh Size - Mesh Time

OpenBenchmarking.orgSeconds, Fewer Is BetterOpenFOAM 10Input: drivaerFastback, Small Mesh Size - Mesh TimeDefault61218243022.971. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

Xmrig

Variant: Wownero - Hash Count: 1M

OpenBenchmarking.orgH/s, More Is BetterXmrig 6.18.1Variant: Wownero - Hash Count: 1MDefault16K32K48K64K80KSE +/- 23.08, N = 474009.71. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

Embree

Binary: Pathtracer ISPC - Model: Asian Dragon Obj

OpenBenchmarking.orgFrames Per Second, More Is BetterEmbree 4.1Binary: Pathtracer ISPC - Model: Asian Dragon ObjDefault306090120150SE +/- 0.05, N = 4123.51MIN: 121.61 / MAX: 126.87

TensorFlow

Device: CPU - Batch Size: 16 - Model: GoogLeNet

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 16 - Model: GoogLeNetDefault306090120150SE +/- 0.40, N = 4156.94

SPECFEM3D

Model: Tomographic Model

OpenBenchmarking.orgSeconds, Fewer Is BetterSPECFEM3D 4.0Model: Tomographic ModelDefault246810SE +/- 0.026518372, N = 58.7389035851. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

SPECFEM3D

Model: Mount St. Helens

OpenBenchmarking.orgSeconds, Fewer Is BetterSPECFEM3D 4.0Model: Mount St. HelensDefault246810SE +/- 0.087169113, N = 58.8981452401. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

SPECFEM3D

Model: Homogeneous Halfspace

OpenBenchmarking.orgSeconds, Fewer Is BetterSPECFEM3D 4.0Model: Homogeneous HalfspaceDefault3691215SE +/- 0.14, N = 411.321. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

NAS Parallel Benchmarks

Test / Class: EP.D

OpenBenchmarking.orgTotal Mop/s, More Is BetterNAS Parallel Benchmarks 3.4Test / Class: EP.DDefault2K4K6K8K10KSE +/- 40.63, N = 410697.901. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

Monte Carlo Simulations of Ionised Nebulae

Input: Gas HII40

OpenBenchmarking.orgSeconds, Fewer Is BetterMonte Carlo Simulations of Ionised Nebulae 2.02.73.3Input: Gas HII40Default3691215SE +/- 0.03, N = 413.281. (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O2 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lz

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512Default306090120150SE +/- 0.85, N = 4153.841. (CXX) g++ options: -O3

TensorFlow

Device: CPU - Batch Size: 64 - Model: AlexNet

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 64 - Model: AlexNetDefault2004006008001000SE +/- 2.22, N = 5786.60

ASKAP

Test: Hogbom Clean OpenMP

OpenBenchmarking.orgIterations Per Second, More Is BetterASKAP 1.0Test: Hogbom Clean OpenMPDefault30060090012001500SE +/- 4.24, N = 41212.171. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

TensorFlow

Device: CPU - Batch Size: 32 - Model: GoogLeNet

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 32 - Model: GoogLeNetDefault50100150200250SE +/- 0.31, N = 3238.18

CloverLeaf

Lagrangian-Eulerian Hydrodynamics

OpenBenchmarking.orgSeconds, Fewer Is BetterCloverLeafLagrangian-Eulerian HydrodynamicsDefault3691215SE +/- 0.04, N = 510.291. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp

Blender

Blend File: BMW27 - Compute: CPU-Only

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 3.6Blend File: BMW27 - Compute: CPU-OnlyDefault48121620SE +/- 0.10, N = 316.28

NAS Parallel Benchmarks

Test / Class: BT.C

OpenBenchmarking.orgTotal Mop/s, More Is BetterNAS Parallel Benchmarks 3.4Test / Class: BT.CDefault70K140K210K280K350KSE +/- 1672.02, N = 5314777.901. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

ASTC Encoder

Preset: Thorough

OpenBenchmarking.orgMT/s, More Is BetterASTC Encoder 4.0Preset: ThoroughDefault1326395265SE +/- 0.02, N = 656.781. (CXX) g++ options: -O3 -flto -pthread

ASTC Encoder

Preset: Exhaustive

OpenBenchmarking.orgMT/s, More Is BetterASTC Encoder 4.0Preset: ExhaustiveDefault246810SE +/- 0.0017, N = 46.14111. (CXX) g++ options: -O3 -flto -pthread

NAS Parallel Benchmarks

Test / Class: SP.C

OpenBenchmarking.orgTotal Mop/s, More Is BetterNAS Parallel Benchmarks 3.4Test / Class: SP.CDefault40K80K120K160K200KSE +/- 1595.09, N = 6207614.701. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

miniFE

Problem Size: Small

OpenBenchmarking.orgCG Mflops, More Is BetterminiFE 2.2Problem Size: SmallDefault12K24K36K48K60KSE +/- 227.90, N = 554408.21. (CXX) g++ options: -O3 -fopenmp -lmpi_cxx -lmpi

TensorFlow

Device: CPU - Batch Size: 16 - Model: AlexNet

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 16 - Model: AlexNetDefault70140210280350SE +/- 0.49, N = 6303.80

Xcompact3d Incompact3d

Input: input.i3d 129 Cells Per Direction

OpenBenchmarking.orgSeconds, Fewer Is BetterXcompact3d Incompact3d 2021-03-11Input: input.i3d 129 Cells Per DirectionDefault0.48940.97881.46821.95762.447SE +/- 0.04209465, N = 152.175216131. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Google Draco

Model: Lion

OpenBenchmarking.orgms, Fewer Is BetterGoogle Draco 1.5.6Model: LionDefault11002200330044005500SE +/- 2.98, N = 750111. (CXX) g++ options: -O3

Google Draco

Model: Church Facade

OpenBenchmarking.orgms, Fewer Is BetterGoogle Draco 1.5.6Model: Church FacadeDefault13002600390052006500SE +/- 3.90, N = 660431. (CXX) g++ options: -O3

ASKAP

Test: tConvolve OpenMP - Degridding

OpenBenchmarking.orgMillion Grid Points Per Second, More Is BetterASKAP 1.0Test: tConvolve OpenMP - DegriddingDefault12K24K36K48K60KSE +/- 1901.83, N = 755153.01. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP

Test: tConvolve OpenMP - Gridding

OpenBenchmarking.orgMillion Grid Points Per Second, More Is BetterASKAP 1.0Test: tConvolve OpenMP - GriddingDefault6K12K18K24K30KSE +/- 0.00, N = 726625.61. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

NAS Parallel Benchmarks

Test / Class: IS.D

OpenBenchmarking.orgTotal Mop/s, More Is BetterNAS Parallel Benchmarks 3.4Test / Class: IS.DDefault12002400360048006000SE +/- 29.57, N = 55696.511. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

Xcompact3d Incompact3d

Input: input.i3d 193 Cells Per Direction

OpenBenchmarking.orgSeconds, Fewer Is BetterXcompact3d Incompact3d 2021-03-11Input: input.i3d 193 Cells Per DirectionDefault246810SE +/- 0.06675033, N = 57.685286811. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

TensorFlow

Device: CPU - Batch Size: 32 - Model: AlexNet

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 32 - Model: AlexNetDefault110220330440550SE +/- 0.63, N = 5518.15

Embree

Binary: Pathtracer ISPC - Model: Crown

OpenBenchmarking.orgFrames Per Second, More Is BetterEmbree 4.1Binary: Pathtracer ISPC - Model: CrownDefault306090120150SE +/- 0.08, N = 7117.83MIN: 114.92 / MAX: 122.62

ACES DGEMM

Sustained Floating-Point Rate

OpenBenchmarking.orgGFLOP/s, More Is BetterACES DGEMM 1.0Sustained Floating-Point RateDefault918273645SE +/- 0.13, N = 740.551. (CC) gcc options: -O3 -march=native -fopenmp

NAS Parallel Benchmarks

Test / Class: LU.C

OpenBenchmarking.orgTotal Mop/s, More Is BetterNAS Parallel Benchmarks 3.4Test / Class: LU.CDefault70K140K210K280K350KSE +/- 1381.97, N = 6337910.641. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512Default70140210280350SE +/- 1.86, N = 6332.741. (CXX) g++ options: -O3

libxsmm

M N K: 32

OpenBenchmarking.orgGFLOPS/s, More Is Betterlibxsmm 2-1.17-3645M N K: 32Default30060090012001500SE +/- 1.58, N = 81311.31. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

ASTC Encoder

Preset: Medium

OpenBenchmarking.orgMT/s, More Is BetterASTC Encoder 4.0Preset: MediumDefault90180270360450SE +/- 0.05, N = 8419.431. (CXX) g++ options: -O3 -flto -pthread

libxsmm

M N K: 64

OpenBenchmarking.orgGFLOPS/s, More Is Betterlibxsmm 2-1.17-3645M N K: 64Default5001000150020002500SE +/- 4.53, N = 72455.41. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

Embree

Binary: Pathtracer ISPC - Model: Asian Dragon

OpenBenchmarking.orgFrames Per Second, More Is BetterEmbree 4.1Binary: Pathtracer ISPC - Model: Asian DragonDefault306090120150SE +/- 0.10, N = 7144.00MIN: 141.47 / MAX: 149.12

NAS Parallel Benchmarks

Test / Class: FT.C

OpenBenchmarking.orgTotal Mop/s, More Is BetterNAS Parallel Benchmarks 3.4Test / Class: FT.CDefault30K60K90K120K150KSE +/- 329.27, N = 8118831.891. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

NAS Parallel Benchmarks

Test / Class: CG.C

OpenBenchmarking.orgTotal Mop/s, More Is BetterNAS Parallel Benchmarks 3.4Test / Class: CG.CDefault13K26K39K52K65KSE +/- 470.01, N = 1059737.941. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

NAS Parallel Benchmarks

Test / Class: MG.C

OpenBenchmarking.orgTotal Mop/s, More Is BetterNAS Parallel Benchmarks 3.4Test / Class: MG.CDefault30K60K90K120K150KSE +/- 1630.26, N = 15137308.271. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256Default20406080100SE +/- 0.52, N = 987.041. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256Default4080120160200SE +/- 1.48, N = 15190.131. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256Default4080120160200SE +/- 1.06, N = 11178.441. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256Default70140210280350SE +/- 1.00, N = 12315.811. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128Default20406080100SE +/- 0.30, N = 1379.921. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128Default306090120150SE +/- 0.54, N = 13129.881. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128Default306090120150SE +/- 0.76, N = 13126.031. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128Default4080120160200SE +/- 0.43, N = 13185.391. (CXX) g++ options: -O3


Phoronix Test Suite v10.8.4