AMD Ryzen 9 9950X 16-Core testing with an ASUS ROG STRIX X670E-E GAMING WIFI (2204 BIOS) and AMD Radeon RX 7900 GRE 16GB on Ubuntu 24.04 via the Phoronix Test Suite.
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G
Processor: AMD Ryzen 9 9950X 16-Core @ 5.75GHz (16 Cores / 32 Threads), Motherboard: ASUS ROG STRIX X670E-E GAMING WIFI (2204 BIOS), Chipset: AMD Device 14d8, Memory: 2 x 16GB DDR5-6000MT/s G Skill F5-6000J3038F16G, Disk: 2000GB Corsair MP700 PRO, Graphics: AMD Radeon RX 7900 GRE 16GB, Audio: AMD Navi 31 HDMI/DP, Monitor: DELL U2723QE, Network: Intel I225-V + Intel Wi-Fi 6E
OS: Ubuntu 24.04, Kernel: 6.10.0-phx (x86_64), Desktop: GNOME Shell 46.0, Display Server: X Server + Wayland, OpenGL: 4.6 Mesa 24.2~git2406040600.8112d4~oibaf~n (git-8112d44 2024-06-04 noble-oibaf-ppa) (LLVM 17.0.6 DRM 3.57), Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 3840x2160
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - CPU Microcode: 0xb40401a
Python Notes: Python 3.12.3
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
PyTorch 2.2.1 - Device: CPU - Batch Size: 256 - Model: ResNet-50 (batches/sec, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 57.54 (SE +/- 0.34, N = 3; MIN: 53.14 / MAX: 58.34)
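The "SE +/- 0.34, N = 3" annotation that accompanies results in this report is a standard error over N runs. The following is a minimal sketch of how such a figure could be derived, assuming the conventional definition (sample standard deviation divided by the square root of N); the report does not state the exact formula the Phoronix Test Suite applies. The per-run values and the helper names standard_error and relative_standard_error are hypothetical, since only the aggregate is published here.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("need at least two runs to estimate a standard error")
    return statistics.stdev(samples) / math.sqrt(len(samples))


def relative_standard_error(mean, se):
    """Standard error expressed as a percentage of the mean."""
    return 100.0 * se / mean


# Hypothetical per-run throughput values (batches/sec); NOT taken from the report.
runs = [57.0, 57.5, 58.0]
print(f"mean = {statistics.fmean(runs):.2f}, SE = {standard_error(runs):.2f}")

# Applied to the aggregate figures reported for the PyTorch ResNet-50 result.
print(f"relative SE = {relative_standard_error(57.54, 0.34):.2f}%")
```

A relative standard error well below one percent, as in the PyTorch result, suggests the three runs agreed closely.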
Graph500 This is a benchmark of the reference implementation of Graph500, an HPC benchmark focused on data intensive loads and commonly tested on supercomputers for complex data problems. Graph500 primarily stresses the communication subsystem of the hardware under test. Learn more via the OpenBenchmarking.org test page.
Scale: 26
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: The test quit with a non-zero exit status. E: mpirun noticed that process rank 14 with PID 0 on node phoronix-System-Product-Name exited on signal 9 (Killed).
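Signal 9 is the signal the Linux out-of-memory killer sends, so memory exhaustion is one plausible explanation for this failure, although the log above does not confirm it. As a rough sketch of the memory involved, the Graph500 specification defines 2^Scale vertices and an edge factor of 16; the snippet below assumes each edge is stored as two 64-bit vertex IDs and treats each 16GB module as 16 GiB. The function name edge_list_bytes is illustrative only.

```python
EDGE_FACTOR = 16         # edges per vertex, per the Graph500 specification
BYTES_PER_VERTEX_ID = 8  # assumption: vertex IDs stored as 64-bit integers


def edge_list_bytes(scale, edge_factor=EDGE_FACTOR, id_bytes=BYTES_PER_VERTEX_ID):
    """Size of the generated edge list: (edge_factor * 2**scale) edges, two IDs each."""
    vertices = 2 ** scale
    edges = edge_factor * vertices
    return edges * 2 * id_bytes


installed_bytes = 2 * 16 * 2**30  # assumption: 2 x 16GB modules counted as 16 GiB each
size = edge_list_bytes(26)

print(f"edge list at scale 26: {size / 2**30:.1f} GiB")  # 16.0 GiB
print(f"share of installed memory: {100 * size / installed_bytes:.0f}%")  # 50%
```

Under these assumptions the edge list alone takes half of the installed memory before any search data structures or per-rank MPI buffers are allocated.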
miniBUDE MiniBUDE is a mini application for the core computation of the Bristol University Docking Engine (BUDE). This test profile currently makes use of the OpenMP implementation of miniBUDE for CPU benchmarking. Learn more via the OpenBenchmarking.org test page.
miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM1 (Billion Interactions/s, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 76.89 (SE +/- 0.27, N = 3)
miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM2 (Billion Interactions/s, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 77.39 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm
Stress-NG
Stress-NG 0.17.08 - Test: Memory Copying (Bogo Ops/s, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 10752.81 (SE +/- 44.94, N = 3)
1. (CXX) g++ options: -lm -lapparmor -latomic -lcrypt -ldl -ljpeg -lEGL -lGLESv2 -lgmp -lgbm -lmpfr -lsctp -lz -lrt -lpthread -lc -std=gnu99 -O2 -U_FORTIFY_SOURCE
Quicksilver Quicksilver is a proxy application that represents some elements of the Mercury workload by solving a simplified dynamic Monte Carlo particle transport problem. Quicksilver is developed by Lawrence Livermore National Laboratory (LLNL) and this test profile currently makes use of the OpenMP CPU threaded code path. Learn more via the OpenBenchmarking.org test page.
Quicksilver 20230818 - Input: CORAL2 P1 (Figure Of Merit, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 25676667 (SE +/- 21858.13, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -march=native
Embree Intel Embree is a collection of high-performance ray-tracing kernels for execution on CPUs (and GPUs via SYCL) and supporting instruction sets such as SSE, AVX, AVX2, and AVX-512. Embree also supports making use of the Intel SPMD Program Compiler (ISPC). Learn more via the OpenBenchmarking.org test page.
Embree 4.3 - Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 36.55 (SE +/- 0.04, N = 3; MIN: 36.13 / MAX: 37.34)
Embree 4.3 - Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 41.49 (SE +/- 0.03, N = 3; MIN: 41.24 / MAX: 42.02)
x265
x265 - Video Input: Bosphorus 4K (Frames Per Second, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 38.27 (SE +/- 0.04, N = 3)
x265 - Video Input: Bosphorus 1080p (Frames Per Second, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 134.52 (SE +/- 0.27, N = 3)
1. x265 [info]: HEVC encoder version 3.5+1-f0c1022b6
miniBUDE MiniBUDE is a mini application for the core computation of the Bristol University Docking Engine (BUDE). This test profile currently makes use of the OpenMP implementation of miniBUDE for CPU benchmarking. Learn more via the OpenBenchmarking.org test page.
miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM1 (GFInst/s, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 1922.13 (SE +/- 6.84, N = 3)
miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM2 (GFInst/s, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 1934.61 (SE +/- 0.64, N = 3)
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm
X Y Z: 144 144 144 - RT: 60
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: The test quit with a non-zero exit status. E: cat: 'HPCG-Benchmark*.txt': No such file or directory
TensorFlow This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.
TensorFlow 2.16.1 - Device: CPU - Batch Size: 1 - Model: ResNet-50 (images/sec, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 16.92 (SE +/- 0.03, N = 3)
LuxCoreRender LuxCoreRender is an open-source 3D physically based renderer formerly known as LuxRender. LuxCoreRender supports CPU-based rendering as well as GPU acceleration via OpenCL, NVIDIA CUDA, and NVIDIA OptiX interfaces. Learn more via the OpenBenchmarking.org test page.
LuxCoreRender 2.6 - Scene: DLSC - Acceleration: CPU (M samples/sec, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 5.40 (SE +/- 0.01, N = 3; MIN: 5.27 / MAX: 5.72)
LuxCoreRender 2.6 - Scene: Danish Mood - Acceleration: CPU (M samples/sec, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 4.62 (SE +/- 0.01, N = 3; MIN: 2.07 / MAX: 5.22)
LuxCoreRender 2.6 - Scene: Orange Juice - Acceleration: CPU (M samples/sec, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 8.64 (SE +/- 0.01, N = 3; MIN: 7.64 / MAX: 9.25)
LuxCoreRender 2.6 - Scene: LuxCore Benchmark - Acceleration: CPU (M samples/sec, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 5.15 (SE +/- 0.01, N = 3; MIN: 2.38 / MAX: 5.77)
LuxCoreRender 2.6 - Scene: Rainbow Colors and Prism - Acceleration: CPU (M samples/sec, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 19.99 (SE +/- 0.07, N = 3; MIN: 18.05 / MAX: 20.42)
RAMspeed SMP 3.5.0 - Type: Copy - Benchmark: Integer (MB/s, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 65797.05 (SE +/- 820.93, N = 3)
RAMspeed SMP 3.5.0 - Type: Scale - Benchmark: Integer (MB/s, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 69141.08 (SE +/- 220.54, N = 3)
RAMspeed SMP 3.5.0 - Type: Triad - Benchmark: Integer (MB/s, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 62428.70 (SE +/- 156.92, N = 3)
RAMspeed SMP 3.5.0 - Type: Average - Benchmark: Integer (MB/s, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 65503.93 (SE +/- 119.74, N = 3)
1. (CC) gcc options: -O3 -march=native
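To put the RAMspeed figures above in context, they can be compared against the theoretical peak of the memory configuration. This is only a sketch under stated assumptions: the two DIMMs are taken to run in dual-channel mode with a 64-bit data path per channel, and the benchmark's "MB/s" is treated as decimal megabytes; the report confirms neither. The function name peak_bandwidth_mb_s is illustrative.

```python
def peak_bandwidth_mb_s(transfer_rate_mt_s, channels=2, bus_width_bits=64):
    """Theoretical peak in MB/s: transfers per second * bytes per transfer * channels."""
    return transfer_rate_mt_s * channels * bus_width_bits // 8


# DDR5-6000 as listed in the system table; dual-channel operation is an assumption.
PEAK = peak_bandwidth_mb_s(6000)

# RAMspeed SMP integer results copied from this report (MB/s).
measured = {
    "Copy": 65797.05,
    "Scale": 69141.08,
    "Triad": 62428.70,
    "Average": 65503.93,
}

for name, mb_s in measured.items():
    print(f"{name}: {100 * mb_s / PEAK:.1f}% of {PEAK} MB/s theoretical peak")
```

Under these assumptions the averaged result lands at roughly two thirds of the theoretical peak.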
MBW 2018-09-08 - Test: Memory Copy - Array Size: 8192 MiB (MiB/s, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 22247.78 (SE +/- 49.10, N = 3)
MBW 2018-09-08 - Test: Memory Copy, Fixed Block Size - Array Size: 4096 MiB (MiB/s, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 19534.16 (SE +/- 24.77, N = 3)
MBW 2018-09-08 - Test: Memory Copy, Fixed Block Size - Array Size: 8192 MiB (MiB/s, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 19467.39 (SE +/- 10.56, N = 3)
1. (CC) gcc options: -O3 -march=native
7-Zip Compression
7-Zip Compression - Test: Compression Rating (MIPS, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 195585 (SE +/- 322.74, N = 3)
7-Zip Compression - Test: Decompression Rating (MIPS, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 158133 (SE +/- 14.15, N = 3)
1. 7-Zip 23.01 (x64) : Copyright (c) 1999-2023 Igor Pavlov : 2023-06-20
Etcpak
Etcpak 2.0 - Benchmark: Multi-Threaded - Configuration: ETC2 (Mpx/s, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 741.78 (SE +/- 1.25, N = 3)
1. (CXX) g++ options: -flto -pthread
LeelaChessZero
LeelaChessZero 0.31.1 - Backend: BLAS (Nodes Per Second, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 226 (SE +/- 2.65, N = 3)
1. (CXX) g++ options: -flto -pthread
Stockfish
Stockfish - Chess Benchmark (Nodes Per Second, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 51377778 (SE +/- 491754.88, N = 15)
1. Stockfish 16 by the Stockfish developers (see AUTHORS file)
GROMACS The GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package, tested with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.
GROMACS 2024 - Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 3.141 (SE +/- 0.001, N = 3)
1. (CXX) g++ options: -O3 -lm
GROMACS - Input: water_GMX50_bare (Ns Per Day, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 1.927 (SE +/- 0.002, N = 3)
1. GROMACS version: 2023.3-Ubuntu_2023.3_1ubuntu3
NAMD
NAMD 3.0b6 - Input: ATPase with 327,506 Atoms (ns/day, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 3.41007 (SE +/- 0.02586, N = 3)
NAMD 3.0b6 - Input: STMV with 1,066,628 Atoms (ns/day, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 0.98084 (SE +/- 0.00027, N = 3)
Memcached Memcached is a high performance, distributed memory object caching system. This Memcached test profile makes use of memtier_benchmark for executing this CPU/memory-focused server benchmark. Learn more via the OpenBenchmarking.org test page.
Memcached 1.6.19 - Set To Get Ratio: 1:10 (Ops/sec, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 5915791.34 (SE +/- 15831.56, N = 3)
Memcached 1.6.19 - Set To Get Ratio: 1:100 (Ops/sec, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 7507615.73 (SE +/- 1150.91, N = 3)
1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Llama.cpp
Llama.cpp b3067 - Model: Meta-Llama-3-8B-Instruct-Q8_0.gguf (Tokens Per Second, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 8.36 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas
Llamafile Test: llava-v1.6-mistral-7b.Q8_0 - Acceleration: CPU
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: The test quit with a non-zero exit status. E: ./run-llava: line 2: ./llava-v1.6-mistral-7b.Q8_0.llamafile.86: No such file or directory
Llamafile 0.8.6 - Test: Meta-Llama-3-8B-Instruct.F16 - Acceleration: CPU (Tokens Per Second, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 4.68
Test: TinyLlama-1.1B-Chat-v1.0.BF16 - Acceleration: CPU
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: The test run did not produce a result. E: sh: 1: exec: ./llamafile: not found
Test: mistral-7b-instruct-v0.2.Q5_K_M - Acceleration: CPU
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: The test run did not produce a result. E: sh: 1: exec: ./llamafile: not found
NAS Parallel Benchmarks NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB. This test profile offers selecting the different NPB tests/problems and varying problem sizes. Learn more via the OpenBenchmarking.org test page.
NAS Parallel Benchmarks 3.4 - Test / Class: BT.C (Total Mop/s, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 55287.37 (SE +/- 54.79, N = 3)
NAS Parallel Benchmarks 3.4 - Test / Class: CG.C (Total Mop/s, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 11144.88 (SE +/- 21.24, N = 3)
NAS Parallel Benchmarks 3.4 - Test / Class: EP.C (Total Mop/s, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 3678.46 (SE +/- 25.12, N = 3)
NAS Parallel Benchmarks 3.4 - Test / Class: EP.D (Total Mop/s, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 3823.80 (SE +/- 2.85, N = 3)
NAS Parallel Benchmarks 3.4 - Test / Class: FT.C (Total Mop/s, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 27106.18 (SE +/- 141.73, N = 3)
NAS Parallel Benchmarks 3.4 - Test / Class: IS.D (Total Mop/s, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 1505.47 (SE +/- 5.33, N = 3)
NAS Parallel Benchmarks 3.4 - Test / Class: LU.C (Total Mop/s, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 59920.96 (SE +/- 150.93, N = 3)
NAS Parallel Benchmarks 3.4 - Test / Class: MG.C (Total Mop/s, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 24236.21 (SE +/- 6.88, N = 3)
NAS Parallel Benchmarks 3.4 - Test / Class: SP.B (Total Mop/s, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 21328.28 (SE +/- 21.50, N = 3)
NAS Parallel Benchmarks 3.4 - Test / Class: SP.C (Total Mop/s, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 14528.26 (SE +/- 18.27, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.6
BRL-CAD BRL-CAD is a cross-platform, open-source solid modeling system with built-in benchmark mode. Learn more via the OpenBenchmarking.org test page.
BRL-CAD 7.38.2 - VGR Performance Metric (VGR Performance Metric, More Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 498966
1. (CXX) g++ options: -std=c++17 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lnetpbm -lregex_brl -lz_brl -lassimp -ldl -lm -ltk8.6
NWChem NWChem is an open-source high performance computational chemistry package. Per NWChem's documentation, "NWChem aims to provide its users with computational chemistry tools that are scalable both in their ability to treat large scientific computational chemistry problems efficiently, and in their use of available parallel computing resources from high-performance parallel supercomputers to conventional workstation clusters." Learn more via the OpenBenchmarking.org test page.
Input: C240 Buckyball
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: The test run did not produce a result.
Xcompact3d Incompact3d Xcompact3d Incompact3d is a Fortran-MPI based, finite difference high-performance code for solving the incompressible Navier-Stokes equations together with as many scalar transport equations as needed. Learn more via the OpenBenchmarking.org test page.
Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 129 Cells Per Direction (Seconds, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 14.09 (SE +/- 0.08, N = 3)
Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 193 Cells Per Direction (Seconds, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 64.11 (SE +/- 0.09, N = 3)
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
OpenFOAM OpenFOAM is the leading free, open-source software for computational fluid dynamics (CFD). This test profile currently uses the drivaerFastback test case for analyzing automotive aerodynamics or alternatively the older motorBike input. Learn more via the OpenBenchmarking.org test page.
OpenFOAM 10 - Input: motorBike - Mesh Time (Seconds, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 84.99
OpenFOAM 10 - Input: motorBike - Execution Time (Seconds, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 58.57
OpenFOAM 10 - Input: drivaerFastback, Small Mesh Size - Mesh Time (Seconds, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 22.58
OpenFOAM 10 - Input: drivaerFastback, Small Mesh Size - Execution Time (Seconds, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 157.15
OpenFOAM 10 - Input: drivaerFastback, Medium Mesh Size - Mesh Time (Seconds, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 181.57
OpenFOAM 10 - Input: drivaerFastback, Medium Mesh Size - Execution Time (Seconds, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 2096.57
1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm
OpenRadioss OpenRadioss is an open-source AGPL-licensed finite element solver for dynamic event analysis. OpenRadioss is based on Altair Radioss and was open-sourced in 2022. This open-source finite element solver is benchmarked with various example models available from https://www.openradioss.org/models/ and https://github.com/OpenRadioss/ModelExchange/tree/main/Examples. This test is currently using a reference OpenRadioss binary build offered via GitHub. Learn more via the OpenBenchmarking.org test page.
OpenRadioss 2023.09.15 - Model: Bumper Beam (Seconds, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 76.90 (SE +/- 0.07, N = 3)
SPECFEM3D simulates acoustic (fluid), elastic (solid), coupled acoustic/elastic, poroelastic or seismic wave propagation in any type of conforming mesh of hexahedra. This test profile currently relies on CPU-based execution for SPECFEM3D and using a variety of their built-in examples/models for benchmarking. Learn more via the OpenBenchmarking.org test page.
SPECFEM3D 4.1.1 - Model: Mount St. Helens (Seconds, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 25.69 (SE +/- 0.17, N = 3)
SPECFEM3D 4.1.1 - Model: Layered Halfspace (Seconds, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 73.14 (SE +/- 0.31, N = 3)
SPECFEM3D 4.1.1 - Model: Tomographic Model (Seconds, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 23.36 (SE +/- 0.16, N = 3)
SPECFEM3D 4.1.1 - Model: Homogeneous Halfspace (Seconds, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 30.68 (SE +/- 0.28, N = 3)
SPECFEM3D 4.1.1 - Model: Water-layered Halfspace (Seconds, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 72.20 (SE +/- 0.34, N = 3)
1. (F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
POV-Ray
POV-Ray - Trace Time (Seconds, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 16.10 (SE +/- 0.03, N = 3)
1. POV-Ray 3.7.0.10.unofficial
Blender
Blender 4.2 - Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 46.38 (SE +/- 0.01, N = 3)
Blender 4.2 - Blend File: Junkshop - Compute: CPU-Only (Seconds, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 62.09 (SE +/- 0.12, N = 3)
Blender 4.2 - Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 448.96 (SE +/- 0.20, N = 3)
XNNPACK
XNNPACK 2cd86b - Model: FP32MobileNetV2 (us, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 1264 (SE +/- 5.21, N = 3)
XNNPACK 2cd86b - Model: FP32MobileNetV3Large (us, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 1516 (SE +/- 19.19, N = 3)
XNNPACK 2cd86b - Model: FP32MobileNetV3Small (us, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 812 (SE +/- 2.31, N = 3)
XNNPACK 2cd86b - Model: FP16MobileNetV2 (us, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 985 (SE +/- 2.31, N = 3)
XNNPACK 2cd86b - Model: FP16MobileNetV3Large (us, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 1291 (SE +/- 2.89, N = 3)
XNNPACK 2cd86b - Model: FP16MobileNetV3Small (us, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 770 (SE +/- 1.76, N = 3)
XNNPACK 2cd86b - Model: QU8MobileNetV2 (us, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 759 (SE +/- 0.33, N = 3)
XNNPACK 2cd86b - Model: QU8MobileNetV3Large (us, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 1118 (SE +/- 1.20, N = 3)
XNNPACK 2cd86b - Model: QU8MobileNetV3Small (us, Fewer Is Better)
2 x 16GB DDR5-6000 CL30 F5-6000J3038F16G: 756
1. (CXX) g++ options: -O3 -lrt -lm
Testing initiated at 14 August 2024 14:50 by user phoronix.