Benchmarks by Michael Larabel for a future article.
GPTshop.ai GH200 Processor: ARMv8 Neoverse-V2 @ 3.39GHz (72 Cores), Motherboard: Quanta Cloud QuantaGrid S74G-2U 1S7GZ9Z0000 S7G MB (CG1) (3A06 BIOS), Memory: 1 x 480GB DRAM-6400MT/s, Disk: 960GB SAMSUNG MZ1L2960HCJR-00A07 + 1920GB SAMSUNG MZTL21T9, Graphics: ASPEED, Network: 2 x Mellanox MT2910 + 2 x QLogic FastLinQ QL41000 10/25/40/50GbE
OS: Ubuntu 23.10, Kernel: 6.5.0-15-generic (aarch64), Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1920x1200
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
Processor Notes: Scaling Governor: cppc_cpufreq performance (Boost: Disabled)
Python Notes: Python 3.11.6
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected
HP Z6 G5 A - Threadripper PRO 7995WX Processor: AMD Ryzen Threadripper PRO 7995WX 96-Cores @ 6.44GHz (96 Cores / 192 Threads), Motherboard: HP Z6 G5 A Workstation 8B24 (U65 Ver. 01.01.04 BIOS), Chipset: AMD Device 14a4, Memory: 8 x 16GB DRAM-5200MT/s Hynix HMCG78AGBRA190N, Disk: 2 x 1024GB SAMSUNG MZVL21T0HCLR-00BH1, Graphics: NVIDIA RTX A4000 16GB, Audio: NVIDIA GA104 HD Audio, Monitor: ASUS VP28U, Network: Realtek RTL8111/8168/8411
OS: Ubuntu 23.10, Kernel: 6.5.0-17-generic (x86_64), Desktop: GNOME Shell 45.2, Display Server: X Server 1.21.1.7, Display Driver: NVIDIA 535.154.05, OpenGL: 4.6.0, OpenCL: OpenCL 3.0 CUDA 12.2.148, Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 3840x2160
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: amd-pstate-epp performance (EPP: performance) - CPU Microcode: 0xa108105
OpenCL Notes: GPU Compute Cores: 6144
Python Notes: Python 3.11.6
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
High Performance Conjugate Gradient HPCG is the High Performance Conjugate Gradient and is a new scientific benchmark from Sandia National Labs focused on supercomputer testing with more modern real-world workloads than HPCC. Learn more via the OpenBenchmarking.org test page.
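For readers unfamiliar with the method the benchmark is named after, here is a minimal sketch of a plain (unpreconditioned) conjugate gradient solver. It is not the HPCG code, which runs a far larger sparse problem in parallel over MPI; the function names and the small 1D test matrix are assumptions chosen purely for illustration.

```python
def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite A, given matvec(v) = A v."""
    x = [0.0] * len(b)
    r = list(b)                      # residual r = b - A x, with x = 0
    p = list(r)                      # first search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:      # stop once the residual norm is small
            break
        ap = matvec(p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def laplacian_1d(v):
    """Apply the tridiagonal (-1, 2, -1) matrix to v without storing the matrix."""
    n = len(v)
    return [
        2.0 * v[i]
        - (v[i - 1] if i > 0 else 0.0)
        - (v[i + 1] if i < n - 1 else 0.0)
        for i in range(n)
    ]


b = [1.0] * 8
x = conjugate_gradient(laplacian_1d, b)
residual = max(abs(bi - ai) for bi, ai in zip(b, laplacian_1d(x)))
print(f"max residual: {residual:.2e}")
```

The solver only needs a matrix-vector product, which is why this style of workload tends to stress memory bandwidth rather than peak floating-point throughput.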
OpenBenchmarking.org GFLOP/s, More Is Better
High Performance Conjugate Gradient 3.1
X Y Z: 144 144 144 - RT: 60
GPTshop.ai GH200: 41.69 (SE +/- 0.29, N = 3)
HP Z6 G5 A - Threadripper PRO 7995WX: The test quit with a non-zero exit status. E: cat: 'HPCG-Benchmark*.txt': No such file or directory
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi
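The "SE +/- ..., N = ..." annotations that accompany the results give a spread over N runs. The sketch below assumes SE means the standard error of the mean (sample standard deviation divided by the square root of N); that reading is an assumption, and the per-run values are invented for illustration, not taken from these results.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical per-run values, invented for illustration only.
runs = [41.0, 42.0, 43.0]
mean = statistics.mean(runs)
se = standard_error(runs)
print(f"{mean:.2f} (SE +/- {se:.2f}, N = {len(runs)})")
```

Under that reading, a result with a larger N would simply be one where more runs were recorded before the average was reported.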
NAS Parallel Benchmarks NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB. This test profile offers selecting the different NPB tests/problems and varying problem sizes. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Total Mop/s, More Is Better
NAS Parallel Benchmarks 3.4
Test / Class: BT.C
HP Z6 G5 A - Threadripper PRO 7995WX: 213489.54 (SE +/- 492.77, N = 3)
GPTshop.ai GH200: 49381.68 (SE +/- 150.57, N = 3)
Test / Class: CG.C
HP Z6 G5 A - Threadripper PRO 7995WX: 50606.21 (SE +/- 557.33, N = 3)
GPTshop.ai GH200: 24046.25 (SE +/- 54.20, N = 3)
Test / Class: FT.C
HP Z6 G5 A - Threadripper PRO 7995WX: 100501.03 (SE +/- 62.59, N = 3)
GPTshop.ai GH200: 48109.28 (SE +/- 82.39, N = 3)
Test / Class: IS.D
HP Z6 G5 A - Threadripper PRO 7995WX: 4351.68 (SE +/- 18.52, N = 3)
GPTshop.ai GH200: 1748.55 (SE +/- 1.10, N = 3)
Test / Class: LU.C
HP Z6 G5 A - Threadripper PRO 7995WX: 251214.23 (SE +/- 1429.28, N = 3)
GPTshop.ai GH200: 39739.62 (SE +/- 430.26, N = 3)
Test / Class: MG.C
HP Z6 G5 A - Threadripper PRO 7995WX: 94796.95 (SE +/- 555.59, N = 3)
GPTshop.ai GH200: 58334.53 (SE +/- 39.48, N = 3)
Test / Class: SP.C
HP Z6 G5 A - Threadripper PRO 7995WX: 88821.77 (SE +/- 97.08, N = 3)
GPTshop.ai GH200: 21268.70 (SE +/- 249.16, N = 4)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.5
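The relative standing across the NAS Parallel Benchmarks results can be summarized with per-test ratios and a geometric mean. The sketch below is illustrative only: it assumes the two values in each result are listed in the same order as the system names (HP Z6 G5 A - Threadripper PRO 7995WX first, GPTshop.ai GH200 second), and the variable names are hypothetical.

```python
import math

# (Threadripper PRO 7995WX, GH200) Total Mop/s pairs copied from the results,
# assuming values appear in the same order as the system names.
npb_mops = {
    "BT.C": (213489.54, 49381.68),
    "CG.C": (50606.21, 24046.25),
    "FT.C": (100501.03, 48109.28),
    "IS.D": (4351.68, 1748.55),
    "LU.C": (251214.23, 39739.62),
    "MG.C": (94796.95, 58334.53),
    "SP.C": (88821.77, 21268.70),
}

# Ratio > 1 means the first-listed system scored higher on that test.
ratios = {test: first / second for test, (first, second) in npb_mops.items()}

# Geometric mean avoids letting one large ratio dominate the summary.
geo_mean = math.exp(sum(math.log(r) for r in ratios.values()) / len(ratios))

for test, ratio in ratios.items():
    print(f"{test}: {ratio:.2f}x")
print(f"Geometric mean: {geo_mean:.2f}x")
```

A geometric mean is used rather than an arithmetic mean because the quantities being averaged are ratios.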
miniBUDE MiniBUDE is a mini application for the core computation of the Bristol University Docking Engine (BUDE). This test profile currently makes use of the OpenMP implementation of miniBUDE for CPU benchmarking. Learn more via the OpenBenchmarking.org test page.
miniBUDE 20210901
Implementation: OpenMP - Input Deck: BM2
OpenBenchmarking.org GFInst/s, More Is Better
HP Z6 G5 A - Threadripper PRO 7995WX: 4472.18 (SE +/- 5.10, N = 3)
GPTshop.ai GH200: 1193.12 (SE +/- 2.95, N = 3)
OpenBenchmarking.org Billion Interactions/s, More Is Better
HP Z6 G5 A - Threadripper PRO 7995WX: 178.89 (SE +/- 0.20, N = 3)
GPTshop.ai GH200: 47.73 (SE +/- 0.12, N = 3)
-march=native -mcpu=native 1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -lm
Rodinia Rodinia is a suite focused upon accelerating compute-intensive applications with accelerators. CUDA, OpenMP, and OpenCL parallel models are supported by the included applications. This profile utilizes select OpenCL, NVIDIA CUDA and OpenMP test binaries at the moment. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better
Rodinia 3.1
Test: OpenMP LavaMD
HP Z6 G5 A - Threadripper PRO 7995WX: 26.81 (SE +/- 0.05, N = 3)
GPTshop.ai GH200: 30.31 (SE +/- 0.08, N = 3)
1. (CXX) g++ options: -O2 -lOpenCL
Algebraic Multi-Grid Benchmark AMG is a parallel algebraic multigrid solver for linear systems arising from problems on unstructured grids. The driver provided with AMG builds linear systems for various 3-dimensional problems. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Figure Of Merit, More Is Better
Algebraic Multi-Grid Benchmark 1.2
HP Z6 G5 A - Threadripper PRO 7995WX: 1669939667 (SE +/- 2196017.33, N = 3)
GPTshop.ai GH200: 1997929111 (SE +/- 16287263.84, N = 9)
1. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -lmpi
libxsmm Libxsmm is an open-source library for specialized dense and sparse matrix operations and deep learning primitives. Libxsmm supports making use of Intel AMX, AVX-512, and other modern CPU instruction set capabilities. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org GFLOPS/s, More Is Better
libxsmm 2-1.17-3645
M N K: 128
HP Z6 G5 A - Threadripper PRO 7995WX: 2039.8 (SE +/- 4.04, N = 3)
GPTshop.ai GH200: The test quit with a non-zero exit status. E: Error: no specialized routine found!
M N K: 256
HP Z6 G5 A - Threadripper PRO 7995WX: 2582.5 (SE +/- 1.59, N = 3)
GPTshop.ai GH200: The test quit with a non-zero exit status. E: Error: no specialized routine found!
1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
NWChem NWChem is an open-source high performance computational chemistry package. Per NWChem's documentation, "NWChem aims to provide its users with computational chemistry tools that are scalable both in their ability to treat large scientific computational chemistry problems efficiently, and in their use of available parallel computing resources from high-performance parallel supercomputers to conventional workstation clusters." Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better
NWChem 7.0.2
Input: C240 Buckyball
HP Z6 G5 A - Threadripper PRO 7995WX: 1601.1
GPTshop.ai GH200: 1403.5
-m64 1. (F9X) gfortran options: -lnwctask -lccsd -lmcscf -lselci -lmp2 -lmoints -lstepper -ldriver -loptim -lnwdft -lgradients -lcphf -lesp -lddscf -ldangchang -lguess -lhessian -lvib -lnwcutil -lrimp2 -lproperty -lsolvation -lnwints -lprepar -lnwmd -lnwpw -lofpw -lpaw -lpspw -lband -lnwpwlib -lcafe -lspace -lanalyze -lqhop -lpfft -ldplot -ldrdy -lvscf -lqmmm -lqmd -letrans -ltce -lbq -lmm -lcons -lperfm -ldntmc -lccca -ldimqm -lga -larmci -lpeigs -l64to32 -lopenblas -lpthread -lrt -llapack -lnwcblas -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz -lcomex -ffast-math -std=legacy -fdefault-integer-8 -finline-functions -O2
Xcompact3d Incompact3d Xcompact3d Incompact3d is a Fortran-MPI based, finite difference high-performance code for solving the incompressible Navier-Stokes equations together with as many scalar transport equations as needed. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better
Xcompact3d Incompact3d 2021-03-11
Input: X3D-benchmarking input.i3d
HP Z6 G5 A - Threadripper PRO 7995WX: 383.99 (SE +/- 0.96, N = 3)
GPTshop.ai GH200: 254.49 (SE +/- 0.50, N = 3)
Input: input.i3d 193 Cells Per Direction
HP Z6 G5 A - Threadripper PRO 7995WX: 10.76054920 (SE +/- 0.08460429, N = 3)
GPTshop.ai GH200: 9.81172053 (SE +/- 0.02630858, N = 3)
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Xmrig Xmrig is an open-source cross-platform CPU/GPU miner for RandomX, KawPow, CryptoNight and AstroBWT. This test profile is set up to measure the Xmrig CPU mining performance. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org H/s, More Is Better
Xmrig 6.18.1
Variant: Monero - Hash Count: 1M
HP Z6 G5 A - Threadripper PRO 7995WX: 56612.2 (SE +/- 161.30, N = 3)
GPTshop.ai GH200: 17253.0 (SE +/- 9.66, N = 3)
Variant: Wownero - Hash Count: 1M
HP Z6 G5 A - Threadripper PRO 7995WX: 71115.4 (SE +/- 250.10, N = 3)
GPTshop.ai GH200: 21924.1 (SE +/- 16.77, N = 3)
-maes 1. (CXX) g++ options: -fexceptions -fno-rtti -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
OpenBenchmarking.org Real C/S, More Is Better
John The Ripper 2023.03.14
Test: WPA PSK
HP Z6 G5 A - Threadripper PRO 7995WX: 612830 (SE +/- 2936.87, N = 3)
GPTshop.ai GH200: 70310 (SE +/- 28.67, N = 3)
-m64 1. (CC) gcc options: -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt
Test: Blowfish
HP Z6 G5 A - Threadripper PRO 7995WX: 172819 (SE +/- 163.99, N = 3)
GPTshop.ai GH200: 69811 (SE +/- 24.83, N = 3)
-m64 -lgmp -lbz2 1. (CC) gcc options: -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt
Test: MD5
HP Z6 G5 A - Threadripper PRO 7995WX: 14692667 (SE +/- 110105.00, N = 3)
GPTshop.ai GH200: 1900667 (SE +/- 12251.98, N = 3)
-m64 1. (CC) gcc options: -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Iterations Per Minute, More Is Better
GraphicsMagick 1.3.38
Operation: Sharpen
HP Z6 G5 A - Threadripper PRO 7995WX: 824 (SE +/- 1.76, N = 3)
GPTshop.ai GH200: 1363 (SE +/- 10.90, N = 3)
Operation: Enhanced
HP Z6 G5 A - Threadripper PRO 7995WX: 1380 (SE +/- 6.64, N = 3)
GPTshop.ai GH200: 1761 (SE +/- 0.33, N = 3)
-ljbig -lwebp -lwebpmux -ltiff -lfreetype -lSM -lICE -lbz2 -lzstd 1. (CC) gcc options: -fopenmp -O2 -ljpeg -lXext -lX11 -llzma -lz -lm -lpthread
SVT-AV1
OpenBenchmarking.org Frames Per Second, More Is Better
SVT-AV1 1.7
Encoder Mode: Preset 4 - Input: Bosphorus 4K
HP Z6 G5 A - Threadripper PRO 7995WX: 7.016 (SE +/- 0.053, N = 3)
GPTshop.ai GH200: 0.826 (SE +/- 0.002, N = 3)
Encoder Mode: Preset 8 - Input: Bosphorus 4K
HP Z6 G5 A - Threadripper PRO 7995WX: 119.98 (SE +/- 0.98, N = 12)
GPTshop.ai GH200: 13.91 (SE +/- 0.03, N = 3)
Encoder Mode: Preset 12 - Input: Bosphorus 4K
HP Z6 G5 A - Threadripper PRO 7995WX: 206.13 (SE +/- 1.23, N = 3)
GPTshop.ai GH200: 31.47 (SE +/- 0.00, N = 3)
Encoder Mode: Preset 13 - Input: Bosphorus 4K
HP Z6 G5 A - Threadripper PRO 7995WX: 202.28 (SE +/- 2.01, N = 3)
GPTshop.ai GH200: 31.50 (SE +/- 0.00, N = 3)
-mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq 1. (CXX) g++ options: -march=native
OpenBenchmarking.org MIPS, More Is Better
7-Zip Compression 22.01
Test: Decompression Rating
HP Z6 G5 A - Threadripper PRO 7995WX: 658991 (SE +/- 3067.65, N = 3)
GPTshop.ai GH200: 389055 (SE +/- 229.99, N = 3)
1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC
Stockfish This is a test of Stockfish, an advanced open-source C++11 chess benchmark that can scale up to 512 CPU threads. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Nodes Per Second, More Is Better
Stockfish 15
Total Time
HP Z6 G5 A - Threadripper PRO 7995WX: 285651359 (SE +/- 3424910.11, N = 3)
GPTshop.ai GH200: 153826682 (SE +/- 3694553.52, N = 12)
-m64 -msse -msse3 -mpopcnt -mavx2 -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 1. (CXX) g++ options: -lgcov -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -flto -flto=jobserver
Primesieve Primesieve generates prime numbers using a highly optimized sieve of Eratosthenes implementation. Primesieve primarily benchmarks the CPU's L1/L2 cache performance. Learn more via the OpenBenchmarking.org test page.
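As a reminder of the underlying algorithm, the following is a minimal sketch of a basic sieve of Eratosthenes. It is not Primesieve's implementation, which is far more heavily optimized; the function name is hypothetical and the limit is kept tiny compared with the 1e13 used in the test.

```python
def count_primes(limit):
    """Count primes <= limit with a basic, unsegmented sieve of Eratosthenes."""
    if limit < 2:
        return 0
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    i = 2
    while i * i <= limit:
        if is_prime[i]:
            # Cross off every multiple of i starting at i*i.
            is_prime[i * i :: i] = bytes(len(range(i * i, limit + 1, i)))
        i += 1
    return sum(is_prime)


print(count_primes(1_000_000))  # prints 78498
```

This version keeps the whole range in one array, so beyond small limits it is bound by memory rather than cache; processing the range in cache-sized segments is the usual way around that.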
OpenBenchmarking.org Seconds, Fewer Is Better
Primesieve 8.0
Length: 1e13
HP Z6 G5 A - Threadripper PRO 7995WX: 23.29 (SE +/- 0.20, N = 3)
GPTshop.ai GH200: 35.49 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -O3
rays1bench This is a test of rays1bench, a simple path tracer / ray tracer that supports SSE and AVX instructions, multi-threading, and other features. This test profile is measuring the performance of the "large scene" in rays1bench. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org mrays/s, More Is Better
rays1bench 2020-01-09
Large Scene
HP Z6 G5 A - Threadripper PRO 7995WX: 582.32 (SE +/- 2.65, N = 3)
GPTshop.ai GH200: The test quit with a non-zero exit status. E: FileNotFoundError: [Errno 2] No such file or directory: './a.out'
oneDNN This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org ms, Fewer Is Better
oneDNN 3.3
Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU
HP Z6 G5 A - Threadripper PRO 7995WX: 0.358902 (SE +/- 0.001912, N = 3, MIN: 0.32)
GPTshop.ai GH200: The test run did not produce a result.
Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU
HP Z6 G5 A - Threadripper PRO 7995WX: 1.25835 (SE +/- 0.01004, N = 3, MIN: 1.01)
GPTshop.ai GH200: The test run did not produce a result.
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
Helsing Helsing is an open-source POSIX vampire number generator. This test profile measures the time it takes to generate vampire numbers between varying numbers of digits. Learn more via the OpenBenchmarking.org test page.
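For context, a vampire number has an even number of digits and is the product of two "fangs", each with half as many digits, whose combined digits are a rearrangement of the number's own digits (and which do not both end in zero). The brute-force sketch below illustrates that definition only; it is not Helsing's algorithm, and the function name is hypothetical.

```python
def vampire_numbers(digits):
    """Return the sorted vampire numbers with the given (even) digit count."""
    if digits % 2:
        return []
    half = digits // 2
    lo, hi = 10 ** (half - 1), 10 ** half
    found = set()
    for a in range(lo, hi):
        for b in range(a, hi):
            # Both fangs ending in zero is excluded by definition.
            if a % 10 == 0 and b % 10 == 0:
                continue
            n = a * b
            if len(str(n)) == digits and sorted(str(n)) == sorted(str(a) + str(b)):
                found.add(n)
    return sorted(found)


print(vampire_numbers(4))
# prints [1260, 1395, 1435, 1530, 1827, 2187, 6880]
```

The pair loop grows with the square of the fang range, which is why generating 14-digit vampire numbers, as in this test, is a meaningful multi-threaded workload.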
OpenBenchmarking.org Seconds, Fewer Is Better
Helsing 1.0-beta
Digit Range: 14 digit
HP Z6 G5 A - Threadripper PRO 7995WX: 55.90 (SE +/- 0.11, N = 3)
GPTshop.ai GH200: 67.61 (SE +/- 0.52, N = 10)
1. (CC) gcc options: -O2 -pthread
Tachyon This is a test of the threaded Tachyon, a parallel ray-tracing system, measuring the time to ray-trace a sample scene. The sample scene used is the Teapot scene ray-traced to 8K x 8K with 32 samples. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better
Tachyon 0.99.2
Total Time
HP Z6 G5 A - Threadripper PRO 7995WX: 16.05 (SE +/- 0.03, N = 3)
GPTshop.ai GH200: The test run did not produce a result. E: ./tachyon-benchmark: 3: ./tachyon: not found
1. (CC) gcc options: -m64 -O3 -fomit-frame-pointer -ffast-math -ltachyon -lm -lpthread
Cpuminer-Opt Cpuminer-Opt is a fork of cpuminer-multi that carries a wide range of CPU performance optimizations for measuring the potential cryptocurrency mining performance of the CPU/processor with a wide variety of cryptocurrencies. The benchmark reports the hash speed for the CPU mining performance for the selected cryptocurrency. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org kH/s, More Is Better
Cpuminer-Opt 23.5
Algorithm: Deepcoin
HP Z6 G5 A - Threadripper PRO 7995WX: 33632 (SE +/- 357.43, N = 5)
GPTshop.ai GH200: The test quit with a non-zero exit status. E: ./cpuminer-opt: 3: ./cpuminer: not found
Algorithm: Blake-2 S
HP Z6 G5 A - Threadripper PRO 7995WX: 560763 (SE +/- 1705.31, N = 3)
GPTshop.ai GH200: The test quit with a non-zero exit status. E: ./cpuminer-opt: 3: ./cpuminer: not found
Algorithm: Myriad-Groestl
HP Z6 G5 A - Threadripper PRO 7995WX: 44940 (SE +/- 64.29, N = 3)
GPTshop.ai GH200: The test quit with a non-zero exit status. E: ./cpuminer-opt: 3: ./cpuminer: not found
Algorithm: Triple SHA-256, Onecoin
HP Z6 G5 A - Threadripper PRO 7995WX: 401393 (SE +/- 2385.23, N = 3)
GPTshop.ai GH200: The test quit with a non-zero exit status. E: ./cpuminer-opt: 3: ./cpuminer: not found
1. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Liquid-DSP LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org samples/s, More Is Better
Liquid-DSP 1.6
Threads: 128 - Buffer Length: 256 - Filter Length: 512
HP Z6 G5 A - Threadripper PRO 7995WX: 1306966667 (SE +/- 6716728.70, N = 3)
GPTshop.ai GH200: 229866667 (SE +/- 271804.67, N = 3)
Threads: 240 - Buffer Length: 256 - Filter Length: 512
HP Z6 G5 A - Threadripper PRO 7995WX: 1530466667 (SE +/- 7410877.89, N = 3)
GPTshop.ai GH200: 237746667 (SE +/- 501741.41, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
ASKAP ASKAP is a set of benchmarks from the Australian SKA Pathfinder. The principal ASKAP benchmarks are the Hogbom Clean Benchmark (tHogbomClean) and Convolutional Resampling Benchmark (tConvolve); some earlier ASKAP benchmarks are also included for OpenCL and CUDA execution of tConvolve. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Mpix/sec, More Is Better
ASKAP 1.0
Test: tConvolve MPI - Degridding
HP Z6 G5 A - Threadripper PRO 7995WX: 40906.5 (SE +/- 524.43, N = 3)
GPTshop.ai GH200: 12407.6 (SE +/- 21.70, N = 3)
Test: tConvolve MPI - Gridding
HP Z6 G5 A - Threadripper PRO 7995WX: 43543.90 (SE +/- 527.54, N = 3)
GPTshop.ai GH200: 9652.19 (SE +/- 13.17, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASTC Encoder ASTC Encoder (astcenc) is for the Adaptive Scalable Texture Compression (ASTC) format commonly used with OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile does a coding test of both compression/decompression. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org MT/s, More Is Better
ASTC Encoder 4.0
Preset: Medium
HP Z6 G5 A - Threadripper PRO 7995WX: 443.46 (SE +/- 1.12, N = 3)
GPTshop.ai GH200: The test quit with a non-zero exit status. E: ./astcenc: 2: ./astc-encoder-4.0.0/build/Source/astcenc-neon: not found
Preset: Thorough
HP Z6 G5 A - Threadripper PRO 7995WX: 62.64 (SE +/- 0.18, N = 3)
GPTshop.ai GH200: The test quit with a non-zero exit status. E: ./astcenc: 2: ./astc-encoder-4.0.0/build/Source/astcenc-neon: not found
Preset: Exhaustive
HP Z6 G5 A - Threadripper PRO 7995WX: 6.6196 (SE +/- 0.0214, N = 3)
GPTshop.ai GH200: The test quit with a non-zero exit status. E: ./astcenc: 2: ./astc-encoder-4.0.0/build/Source/astcenc-neon: not found
1. (CXX) g++ options: -O3 -flto -pthread
Graph500 This is a benchmark of the reference implementation of Graph500, an HPC benchmark focused on data intensive loads and commonly tested on supercomputers for complex data problems. Graph500 primarily stresses the communication subsystem of the hardware under test. Learn more via the OpenBenchmarking.org test page.
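The bfs results are reported in TEPS (traversed edges per second). The sketch below shows the idea on a toy graph: run a breadth-first search, count edges, and divide by elapsed time. It is a simplification under stated assumptions: it counts every scanned adjacency entry, which is not Graph500's own edge-counting rule, and the graph and function names are made up for illustration.

```python
import time
from collections import deque


def bfs_edges_scanned(adjacency, source):
    """Breadth-first search; return (parent map, number of adjacency entries scanned)."""
    parent = {source: source}
    queue = deque([source])
    edges = 0
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            edges += 1
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent, edges


# Tiny undirected graph stored as adjacency lists; vertex 4 is unreachable.
adjacency = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2], 4: []}

start = time.perf_counter()
parent, edges = bfs_edges_scanned(adjacency, 0)
elapsed = time.perf_counter() - start

teps = edges / elapsed if elapsed > 0 else float("inf")
print(f"scanned {edges} edges, reached {len(parent)} vertices, ~{teps:.0f} TEPS")
```

Because the inner loop is dominated by irregular lookups into the visited set, this kind of traversal is typically limited by memory latency and communication rather than arithmetic.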
Graph500 3.0
Scale: 26
OpenBenchmarking.org bfs median_TEPS, More Is Better
HP Z6 G5 A - Threadripper PRO 7995WX: 634702000
GPTshop.ai GH200: 1249790000
OpenBenchmarking.org bfs max_TEPS, More Is Better
HP Z6 G5 A - Threadripper PRO 7995WX: 647711000
GPTshop.ai GH200: 1315650000
OpenBenchmarking.org sssp median_TEPS, More Is Better
HP Z6 G5 A - Threadripper PRO 7995WX: 317931000
GPTshop.ai GH200: 299027000
OpenBenchmarking.org sssp max_TEPS, More Is Better
HP Z6 G5 A - Threadripper PRO 7995WX: 399994000
GPTshop.ai GH200: 467012000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi
GROMACS This is a test of the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package using the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Ns Per Day, More Is Better
GROMACS 2023
Implementation: MPI CPU - Input: water_GMX50_bare
HP Z6 G5 A - Threadripper PRO 7995WX: 10.314 (SE +/- 0.389, N = 9)
GPTshop.ai GH200: 5.194 (SE +/- 0.002, N = 3)
1. (CXX) g++ options: -O3
DuckDB DuckDB is an in-progress SQL OLAP database management system optimized for analytics and features a vectorized and parallel engine. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better
DuckDB 0.9.1
Benchmark: IMDB
HP Z6 G5 A - Threadripper PRO 7995WX: 104.28 (SE +/- 0.12, N = 3)
GPTshop.ai GH200: 92.08 (SE +/- 0.70, N = 3)
Benchmark: TPC-H Parquet
HP Z6 G5 A - Threadripper PRO 7995WX: 120.76 (SE +/- 0.09, N = 3)
GPTshop.ai GH200: 148.76 (SE +/- 1.06, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lssl -lcrypto -ldl
PostgreSQL This is a benchmark of PostgreSQL using the integrated pgbench for facilitating the database benchmarks. Learn more via the OpenBenchmarking.org test page.
PostgreSQL 16
Scaling Factor: 100 - Clients: 1000 - Mode: Read Write
OpenBenchmarking.org TPS, More Is Better
HP Z6 G5 A - Threadripper PRO 7995WX: 16124 (SE +/- 176.81, N = 3)
GPTshop.ai GH200: 54975 (SE +/- 783.91, N = 11)
Scaling Factor: 100 - Clients: 1000 - Mode: Read Write - Average Latency
OpenBenchmarking.org ms, Fewer Is Better
HP Z6 G5 A - Threadripper PRO 7995WX: 62.03 (SE +/- 0.68, N = 3)
GPTshop.ai GH200: 18.23 (SE +/- 0.28, N = 11)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
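The PostgreSQL TPS and average latency figures are two views of the same runs. As a rough cross-check, the sketch below applies Little's law (latency is roughly clients divided by throughput), assuming each of the 1000 clients keeps exactly one transaction in flight with no think time. The helper name is hypothetical, and the pairing of TPS with latency follows the order in which the values are listed.

```python
def expected_latency_ms(clients, tps):
    """Little's law for a closed loop: average latency = clients / throughput."""
    return clients / tps * 1000.0


CLIENTS = 1000
# (reported TPS, reported average latency in ms), paired by listing order.
reported = [(16124, 62.03), (54975, 18.23)]

for tps, latency_ms in reported:
    estimate = expected_latency_ms(CLIENTS, tps)
    print(f"{tps} TPS -> ~{estimate:.2f} ms estimated, {latency_ms} ms reported")
```

The estimates come out at roughly 62.02 ms and 18.19 ms, close to the reported averages; small gaps are expected since the reported numbers are averaged over several runs.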
Stress-NG 0.16.04 - Test: Matrix Math (Bogo Ops/s, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 430429.52 (SE +/- 1114.33, N = 3)
  GPTshop.ai GH200: 512759.08 (SE +/- 3363.22, N = 3)
Stress-NG 0.16.04 - Test: Vector Math (Bogo Ops/s, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 619071.09 (SE +/- 183.75, N = 3)
  GPTshop.ai GH200: 359058.52 (SE +/- 38.26, N = 3)
Stress-NG 0.16.04 - Test: AVX-512 VNNI (Bogo Ops/s, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 9099613.65 (SE +/- 4488.69, N = 3)
  GPTshop.ai GH200: 4173580.13 (SE +/- 21528.43, N = 3)
Stress-NG 0.16.04 - Test: Floating Point (Bogo Ops/s, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 30460.61 (SE +/- 30.90, N = 3)
  GPTshop.ai GH200: 20137.35 (SE +/- 4.82, N = 3)
Stress-NG 0.16.04 - Test: Matrix 3D Math (Bogo Ops/s, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 9100.74 (SE +/- 2.39, N = 3)
  GPTshop.ai GH200: 17483.02 (SE +/- 12.40, N = 3)
Stress-NG 0.16.04 - Test: Memory Copying (Bogo Ops/s, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 27883.51 (SE +/- 21.26, N = 3)
  GPTshop.ai GH200: 27182.72 (SE +/- 78.88, N = 3)
Stress-NG 0.16.04 - Test: Wide Vector Math (Bogo Ops/s, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 2721964.70 (SE +/- 3634.23, N = 3)
  GPTshop.ai GH200: 2002466.67 (SE +/- 16548.38, N = 15)
Stress-NG 0.16.04 - Test: Fused Multiply-Add (Bogo Ops/s, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 206576021.93 (SE +/- 157257.42, N = 3)
  GPTshop.ai GH200: 139525267.41 (SE +/- 1701013.93, N = 3)
Stress-NG 0.16.04 - Test: Vector Floating Point (Bogo Ops/s, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 219092.94 (SE +/- 135.17, N = 3)
  GPTshop.ai GH200: 103967.93 (SE +/- 87.06, N = 3)
(CXX) g++ options: -O2 -std=gnu99 -lc
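One way to summarize the Stress-NG charts above is a per-test ratio of the two systems plus a geometric mean of those ratios. The sketch below is illustrative only: it assumes the values pair with the systems in the order the labels are listed (HP first, GH200 second), and the summary it produces is not part of the original result file.

```python
import math

# test name -> (HP Z6 G5 A, GPTshop.ai GH200) in Bogo Ops/s, copied from the charts above
stress_ng = {
    "Matrix Math": (430429.52, 512759.08),
    "Vector Math": (619071.09, 359058.52),
    "AVX-512 VNNI": (9099613.65, 4173580.13),
    "Floating Point": (30460.61, 20137.35),
    "Matrix 3D Math": (9100.74, 17483.02),
    "Memory Copying": (27883.51, 27182.72),
    "Wide Vector Math": (2721964.70, 2002466.67),
    "Fused Multiply-Add": (206576021.93, 139525267.41),
    "Vector Floating Point": (219092.94, 103967.93),
}

# Ratio > 1 means the GH200 scored higher on that test (more is better).
ratios = {test: gh200 / hp for test, (hp, gh200) in stress_ng.items()}

# Geometric mean: exponentiate the average of the log-ratios.
geo_mean = math.exp(sum(math.log(r) for r in ratios.values()) / len(ratios))

for test, ratio in ratios.items():
    print(f"{test}: GH200 / HP = {ratio:.2f}")
print(f"Geometric mean of ratios: {geo_mean:.2f}")
```

The geometric mean is used rather than the arithmetic mean so that a test where one system is twice as fast and a test where it is half as fast cancel out.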
OpenVINO
This is a test of Intel OpenVINO, a toolkit for neural network inference, using its built-in benchmarking support to measure throughput and latency for various models.
OpenVINO 2023.2.dev - Model: Face Detection FP16 - Device: CPU (FPS, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 48.01 (SE +/- 0.07, N = 3)
  GPTshop.ai GH200: 7.46 (SE +/- 0.05, N = 3)
OpenVINO 2023.2.dev - Model: Face Detection FP16 - Device: CPU (ms, Fewer Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 993.05 (SE +/- 0.98, N = 3; MIN: 482.47 / MAX: 1144.59)
  GPTshop.ai GH200: 134.06 (SE +/- 0.82, N = 3)
OpenVINO 2023.2.dev - Model: Person Detection FP16 - Device: CPU (FPS, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 340.49 (SE +/- 0.25, N = 3)
  GPTshop.ai GH200: 19.06 (SE +/- 0.19, N = 15)
OpenVINO 2023.2.dev - Model: Person Detection FP16 - Device: CPU (ms, Fewer Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 140.80 (SE +/- 0.12, N = 3; MIN: 52.25 / MAX: 220.24)
  GPTshop.ai GH200: 52.51 (SE +/- 0.52, N = 15; MIN: 45.64 / MAX: 88.61)
OpenVINO 2023.2.dev - Model: Person Detection FP32 - Device: CPU (FPS, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 341.45 (SE +/- 0.34, N = 3)
  GPTshop.ai GH200: 19.16 (SE +/- 0.22, N = 4)
OpenVINO 2023.2.dev - Model: Person Detection FP32 - Device: CPU (ms, Fewer Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 140.45 (SE +/- 0.13, N = 3; MIN: 52.13 / MAX: 230.35)
  GPTshop.ai GH200: 52.20 (SE +/- 0.59, N = 4; MIN: 46.86 / MAX: 83.61)
OpenVINO 2023.2.dev - Model: Vehicle Detection FP16 - Device: CPU (FPS, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 3237.40 (SE +/- 4.78, N = 3)
  GPTshop.ai GH200: 124.47 (SE +/- 0.97, N = 3)
OpenVINO 2023.2.dev - Model: Vehicle Detection FP16 - Device: CPU (ms, Fewer Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 14.81 (SE +/- 0.02, N = 3; MIN: 6.08 / MAX: 45.26)
  GPTshop.ai GH200: 8.02 (SE +/- 0.06, N = 3; MIN: 6.08 / MAX: 18.7)
OpenVINO 2023.2.dev - Model: Face Detection FP16-INT8 - Device: CPU (FPS, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 94.94 (SE +/- 0.05, N = 3)
  GPTshop.ai GH200: 1.01 (SE +/- 0.00, N = 3)
OpenVINO 2023.2.dev - Model: Face Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 503.91 (SE +/- 0.20, N = 3; MIN: 253.56 / MAX: 608.11)
  GPTshop.ai GH200: 993.52 (SE +/- 4.42, N = 3; MIN: 966.61 / MAX: 1035.71)
OpenVINO 2023.2.dev - Model: Face Detection Retail FP16 - Device: CPU (FPS, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 11647.89 (SE +/- 20.00, N = 3)
  GPTshop.ai GH200: 184.36 (SE +/- 0.31, N = 3)
OpenVINO 2023.2.dev - Model: Face Detection Retail FP16 - Device: CPU (ms, Fewer Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 4.11 (SE +/- 0.01, N = 3; MIN: 2.29 / MAX: 22.1)
  GPTshop.ai GH200: 5.41 (SE +/- 0.01, N = 3; MIN: 3.66 / MAX: 12.71)
OpenVINO 2023.2.dev - Model: Road Segmentation ADAS FP16 - Device: CPU (FPS, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 1712.72 (SE +/- 1.75, N = 3)
  GPTshop.ai GH200: 44.58 (SE +/- 0.39, N = 15)
OpenVINO 2023.2.dev - Model: Road Segmentation ADAS FP16 - Device: CPU (ms, Fewer Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 28.00 (SE +/- 0.03, N = 3; MIN: 18.2 / MAX: 67.59)
  GPTshop.ai GH200: 22.44 (SE +/- 0.19, N = 15; MIN: 18.38 / MAX: 33.33)
OpenVINO 2023.2.dev - Model: Vehicle Detection FP16-INT8 - Device: CPU (FPS, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 5727.47 (SE +/- 16.25, N = 3)
  GPTshop.ai GH200: 9.94 (SE +/- 0.08, N = 3)
OpenVINO 2023.2.dev - Model: Vehicle Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 8.37 (SE +/- 0.02, N = 3)
  GPTshop.ai GH200: 100.62 (SE +/- 0.77, N = 3; MIN: 95.54 / MAX: 114.29)
OpenVINO 2023.2.dev - Model: Weld Porosity Detection FP16 - Device: CPU (FPS, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 4793.80 (SE +/- 6.25, N = 3)
  GPTshop.ai GH200: 285.00 (SE +/- 1.58, N = 3)
OpenVINO 2023.2.dev - Model: Weld Porosity Detection FP16 - Device: CPU (ms, Fewer Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 20.01 (SE +/- 0.03, N = 3; MIN: 10.21 / MAX: 40.65)
  GPTshop.ai GH200: 3.50 (SE +/- 0.02, N = 3; MIN: 2.11 / MAX: 13.96)
OpenVINO 2023.2.dev - Model: Face Detection Retail FP16-INT8 - Device: CPU (FPS, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 16968.10 (SE +/- 38.34, N = 3)
  GPTshop.ai GH200: 26.53 (SE +/- 0.29, N = 5)
OpenVINO 2023.2.dev - Model: Face Detection Retail FP16-INT8 - Device: CPU (ms, Fewer Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 5.64 (SE +/- 0.01, N = 3; MIN: 3.25 / MAX: 24.09)
  GPTshop.ai GH200: 37.70 (SE +/- 0.42, N = 5; MIN: 34.56 / MAX: 47.33)
OpenVINO 2023.2.dev - Model: Road Segmentation ADAS FP16-INT8 - Device: CPU (FPS, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 2039.24 (SE +/- 2.05, N = 3)
  GPTshop.ai GH200: 3.24 (SE +/- 0.00, N = 3)
OpenVINO 2023.2.dev - Model: Road Segmentation ADAS FP16-INT8 - Device: CPU (ms, Fewer Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 23.52 (SE +/- 0.02, N = 3)
  GPTshop.ai GH200: 308.42 (SE +/- 0.47, N = 3; MIN: 299.67 / MAX: 324.14)
OpenVINO 2023.2.dev - Model: Machine Translation EN To DE FP16 - Device: CPU (FPS, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 555.21 (SE +/- 0.29, N = 3)
  GPTshop.ai GH200: 23.78 (SE +/- 0.26, N = 3)
OpenVINO 2023.2.dev - Model: Machine Translation EN To DE FP16 - Device: CPU (ms, Fewer Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 86.38 (SE +/- 0.05, N = 3; MIN: 51.34 / MAX: 158.07)
  GPTshop.ai GH200: 42.05 (SE +/- 0.47, N = 3; MIN: 37.47 / MAX: 166.36)
OpenVINO 2023.2.dev - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (FPS, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 9585.34 (SE +/- 38.08, N = 3)
  GPTshop.ai GH200: 76.44 (SE +/- 0.58, N = 3)
OpenVINO 2023.2.dev - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 10.00 (SE +/- 0.04, N = 3; MIN: 5.34 / MAX: 28.56)
  GPTshop.ai GH200: 13.07 (SE +/- 0.10, N = 3; MIN: 11.41 / MAX: 24.95)
OpenVINO 2023.2.dev - Model: Person Vehicle Bike Detection FP16 - Device: CPU (FPS, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 5467.67 (SE +/- 13.90, N = 3)
  GPTshop.ai GH200: 26.68 (SE +/- 0.33, N = 3)
OpenVINO 2023.2.dev - Model: Person Vehicle Bike Detection FP16 - Device: CPU (ms, Fewer Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 8.76 (SE +/- 0.02, N = 3; MIN: 6.07 / MAX: 28.47)
  GPTshop.ai GH200: 37.48 (SE +/- 0.46, N = 3; MIN: 33.64 / MAX: 48.73)
OpenVINO 2023.2.dev - Model: Handwritten English Recognition FP16 - Device: CPU (FPS, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 2541.51 (SE +/- 9.61, N = 3)
  GPTshop.ai GH200: 4.33 (SE +/- 0.04, N = 3)
OpenVINO 2023.2.dev - Model: Handwritten English Recognition FP16 - Device: CPU (ms, Fewer Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 37.75 (SE +/- 0.14, N = 3; MIN: 23.66 / MAX: 98.37)
  GPTshop.ai GH200: 231.18 (SE +/- 2.13, N = 3; MIN: 220.77 / MAX: 317.5)
OpenVINO 2023.2.dev - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (FPS, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 83726.28 (SE +/- 300.34, N = 3)
  GPTshop.ai GH200: 781.77 (SE +/- 8.02, N = 3)
OpenVINO 2023.2.dev - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, Fewer Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 0.90 (SE +/- 0.00, N = 3; MIN: 0.25 / MAX: 18.19)
  GPTshop.ai GH200: 1.27 (SE +/- 0.01, N = 3; MIN: 0.64 / MAX: 3.24)
OpenVINO 2023.2.dev - Model: Handwritten English Recognition FP16-INT8 - Device: CPU (FPS, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 2141.63 (SE +/- 35.73, N = 12)
  GPTshop.ai GH200: 4.16 (SE +/- 0.01, N = 3)
OpenVINO 2023.2.dev - Model: Handwritten English Recognition FP16-INT8 - Device: CPU (ms, Fewer Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 44.96 (SE +/- 0.90, N = 12; MIN: 30.33 / MAX: 84.49)
  GPTshop.ai GH200: 240.51 (SE +/- 0.76, N = 3; MIN: 230.79 / MAX: 323.15)
OpenVINO 2023.2.dev - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (FPS, More Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 107003.12 (SE +/- 398.68, N = 3)
  GPTshop.ai GH200: 595.73 (SE +/- 3.96, N = 15)
OpenVINO 2023.2.dev - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (ms, Fewer Is Better)
  HP Z6 G5 A - Threadripper PRO 7995WX: 0.65 (SE +/- 0.00, N = 3; MIN: 0.21 / MAX: 20.43)
  GPTshop.ai GH200: 1.67 (SE +/- 0.01, N = 15; MIN: 1.08 / MAX: 4.63)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie
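The OpenVINO throughput and latency charts above can be read together: by Little's law, throughput multiplied by average latency estimates how many inference requests were in flight on average. The sketch below applies that to a few of the models. It is an illustration under the assumption that the values pair with the systems in the order the labels are listed, and the helper name `in_flight` is made up for this example.

```python
def in_flight(fps: float, latency_ms: float) -> float:
    """Little's law: average requests in flight = throughput * latency (in seconds)."""
    return fps * (latency_ms / 1000.0)


# (model, system) -> (FPS, average latency in ms), copied from the charts above
samples = {
    ("Face Detection FP16", "HP Z6 G5 A - Threadripper PRO 7995WX"): (48.01, 993.05),
    ("Face Detection FP16", "GPTshop.ai GH200"): (7.46, 134.06),
    ("Person Detection FP16", "HP Z6 G5 A - Threadripper PRO 7995WX"): (340.49, 140.80),
    ("Person Detection FP16", "GPTshop.ai GH200"): (19.06, 52.51),
    ("Vehicle Detection FP16", "HP Z6 G5 A - Threadripper PRO 7995WX"): (3237.40, 14.81),
    ("Vehicle Detection FP16", "GPTshop.ai GH200"): (124.47, 8.02),
}

for (model, system), (fps, latency_ms) in samples.items():
    estimate = in_flight(fps, latency_ms)
    print(f"{model} / {system}: about {estimate:.1f} requests in flight")
```

For these samples the estimate comes out near 48 on the Threadripper system and near 1 on the GH200. One possible reading is that the two runs used very different levels of request parallelism, so the FPS gap may reflect run configuration as well as hardware. The result file itself does not state the stream or request counts.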
GPTshop.ai GH200
Processor: ARMv8 Neoverse-V2 @ 3.39GHz (72 Cores), Motherboard: Quanta Cloud QuantaGrid S74G-2U 1S7GZ9Z0000 S7G MB (CG1) (3A06 BIOS), Memory: 1 x 480GB DRAM-6400MT/s, Disk: 960GB SAMSUNG MZ1L2960HCJR-00A07 + 1920GB SAMSUNG MZTL21T9, Graphics: ASPEED, Network: 2 x Mellanox MT2910 + 2 x QLogic FastLinQ QL41000 10/25/40/50GbE
OS: Ubuntu 23.10, Kernel: 6.5.0-15-generic (aarch64), Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1920x1200
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
Processor Notes: Scaling Governor: cppc_cpufreq performance (Boost: Disabled)
Python Notes: Python 3.11.6
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 5 February 2024 20:48 by user x.
HP Z6 G5 A - Threadripper PRO 7995WX
Processor: AMD Ryzen Threadripper PRO 7995WX 96-Cores @ 6.44GHz (96 Cores / 192 Threads), Motherboard: HP Z6 G5 A Workstation 8B24 (U65 Ver. 01.01.04 BIOS), Chipset: AMD Device 14a4, Memory: 8 x 16GB DRAM-5200MT/s Hynix HMCG78AGBRA190N, Disk: 2 x 1024GB SAMSUNG MZVL21T0HCLR-00BH1, Graphics: NVIDIA RTX A4000 16GB, Audio: NVIDIA GA104 HD Audio, Monitor: ASUS VP28U, Network: Realtek RTL8111/8168/8411
OS: Ubuntu 23.10, Kernel: 6.5.0-17-generic (x86_64), Desktop: GNOME Shell 45.2, Display Server: X Server 1.21.1.7, Display Driver: NVIDIA 535.154.05, OpenGL: 4.6.0, OpenCL: OpenCL 3.0 CUDA 12.2.148, Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 3840x2160
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: amd-pstate-epp performance (EPP: performance) - CPU Microcode: 0xa108105
OpenCL Notes: GPU Compute Cores: 6144
Python Notes: Python 3.11.6
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 17 February 2024 17:07 by user phoronix.