2 x AMD EPYC 7742 64-Core testing with a Supermicro H11DSi-NT v2.00 (2.1 BIOS) and ASPEED on Ubuntu 20.04 via the Phoronix Test Suite.
EPYC 7742 2P
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Disk Notes: NONE / errors=remount-ro,relatime,rw / Block Size: 4096
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0x8301034
Java Notes: OpenJDK Runtime Environment (build 11.0.10+9-Ubuntu-0ubuntu1.20.04)
Python Notes: Python 2.7.18 + Python 3.8.5
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
2P
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0x8301034
Python Notes: Python 2.7.18 + Python 3.8.5
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
2 x AMD EPYC 7742 64-Core
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0x8301034
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
7742 2P Repeat Processor: 2 x AMD EPYC 7742 64-Core @ 2.25GHz (128 Cores / 256 Threads), Motherboard: Supermicro H11DSi-NT v2.00 (2.1 BIOS), Chipset: AMD Starship/Matisse, Memory: 16 x 8192 MB DDR4-3200MT/s HMA81GR7CJR8N-XN, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Monitor: VGA HDMI, Network: 2 x Intel 10G X550T
OS: Ubuntu 20.04, Kernel: 5.8.0-44-generic (x86_64), Display Server: X Server 1.20.8, Compiler: GCC 9.3.0, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Disk Notes: NONE / errors=remount-ro,relatime,rw / Block Size: 4096
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0x8301034
Java Notes: OpenJDK Runtime Environment (build 11.0.10+9-Ubuntu-0ubuntu1.20.04)
Python Notes: Python 2.7.18 + Python 3.8.5
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
IOR IOR is a parallel I/O storage benchmark making use of MPI with a particular focus on HPC (High Performance Computing) systems. IOR is developed at the Lawrence Livermore National Laboratory (LLNL). Learn more via the OpenBenchmarking.org test page.
IOR 3.3.0 - Block Size: 2MB - Disk Target: Default Test Directory (MB/s, More Is Better). EPYC 7742 2P: 445.10 (SE +/- 1.26, N = 3; MIN: 379.87 / MAX: 836.63). 7742 2P Repeat: 452.03 (SE +/- 4.26, N = 7; MIN: 378.75 / MAX: 1032.92). 1. (CC) gcc options: -O2 -lm -pthread -lmpi
IOR 3.3.0 - Block Size: 4MB - Disk Target: Default Test Directory (MB/s, More Is Better). EPYC 7742 2P: 485.74 (SE +/- 6.27, N = 3; MIN: 400.81 / MAX: 1055.7). 7742 2P Repeat: 480.55 (SE +/- 2.94, N = 3; MIN: 405.25 / MAX: 1028.47). 1. (CC) gcc options: -O2 -lm -pthread -lmpi
IOR 3.3.0 - Block Size: 8MB - Disk Target: Default Test Directory (MB/s, More Is Better). EPYC 7742 2P: 489.40 (SE +/- 1.12, N = 3; MIN: 410.57 / MAX: 920.3). 7742 2P Repeat: 485.13 (SE +/- 3.76, N = 3; MIN: 411.08 / MAX: 1041.22). 1. (CC) gcc options: -O2 -lm -pthread -lmpi
IOR 3.3.0 - Block Size: 16MB - Disk Target: Default Test Directory (MB/s, More Is Better). EPYC 7742 2P: 480.63 (SE +/- 1.21, N = 3; MIN: 410.37 / MAX: 1034.82). 7742 2P Repeat: 468.11 (SE +/- 5.31, N = 3; MIN: 411 / MAX: 1031.92). 1. (CC) gcc options: -O2 -lm -pthread -lmpi
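Each result on this page is reported with an "SE +/- x, N = n" figure. The sketch below shows one common way such a value can be computed, assuming SE means the sample standard deviation divided by the square root of the run count; that definition is an assumption here, since the page does not state the exact formula. The sample readings are made up for illustration because the individual run values are not listed in this result file.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("need at least two runs to estimate a standard error")
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical MB/s readings from three runs (illustrative only, not taken from this page).
runs = [443.0, 445.0, 447.0]
mean = statistics.mean(runs)
se = standard_error(runs)
print(f"{mean:.2f} MB/s, SE +/- {se:.2f}, N = {len(runs)}")
```

A small SE relative to the mean suggests the runs were consistent; a larger N on one side of a comparison often means the harness repeated the test because early runs varied.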
QuantLib QuantLib is an open-source library/framework for quantitative finance, covering modeling, trading and risk management scenarios. QuantLib is written in C++ with Boost, and this test uses its built-in benchmark, which reports the QuantLib Benchmark Index score. Learn more via the OpenBenchmarking.org test page.
QuantLib 1.21 (MFLOPS, More Is Better). EPYC 7742 2P: 2015.9 (SE +/- 14.66, N = 3). 7742 2P Repeat: 2009.2 (SE +/- 13.17, N = 3). 1. (CXX) g++ options: -O3 -march=native -rdynamic
Etcpak Etcpak is the self-proclaimed "fastest ETC compressor on the planet", focused on providing open-source, very fast ETC and S3 texture compression support. Learn more via the OpenBenchmarking.org test page.
Etcpak 0.7 - Configuration: DXT1 (Mpx/s, More Is Better). EPYC 7742 2P: 1039.81 (SE +/- 3.71, N = 3). 7742 2P Repeat: 1037.46 (SE +/- 3.65, N = 3). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
Etcpak 0.7 - Configuration: ETC1 (Mpx/s, More Is Better). EPYC 7742 2P: 236.72 (SE +/- 0.10, N = 3). 7742 2P Repeat: 237.25 (SE +/- 0.11, N = 3). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
Etcpak 0.7 - Configuration: ETC2 (Mpx/s, More Is Better). EPYC 7742 2P: 139.50 (SE +/- 0.00, N = 3). 7742 2P Repeat: 139.56 (SE +/- 0.07, N = 3). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
Etcpak 0.7 - Configuration: ETC1 + Dithering (Mpx/s, More Is Better). EPYC 7742 2P: 224.16 (SE +/- 0.05, N = 3). 7742 2P Repeat: 224.60 (SE +/- 0.03, N = 3). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
HPL Linpack HPL is a well known portable Linpack implementation for distributed memory systems. This test profile is testing HPL upstream directly, outside the scope of the HPC Challenge test profile also available through the Phoronix Test Suite (hpcc). The test profile attempts to generate an optimized HPL.dat input file based on the CPU/memory under test. The automated HPL.dat input generation is still being tuned and thus for now this test profile remains "experimental". Learn more via the OpenBenchmarking.org test page.
HPL Linpack 2.3 (GFLOPS, More Is Better). EPYC 7742 2P: 153.59 (SE +/- 0.42, N = 3). 1. (CC) gcc options: -O2 -lopenblas -lm -pthread -lmpi
NAS Parallel Benchmarks NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB. This test profile offers selecting the different NPB tests/problems and varying problem sizes. Learn more via the OpenBenchmarking.org test page.
NAS Parallel Benchmarks 3.4 - Test / Class: CG.C (Total Mop/s, More Is Better). EPYC 7742 2P: 41060.19 (SE +/- 380.32, N = 3). 7742 2P Repeat: 39810.96 (SE +/- 428.23, N = 5). 1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi 2. Open MPI 4.0.3
NAS Parallel Benchmarks 3.4 - Test / Class: EP.C (Total Mop/s, More Is Better). EPYC 7742 2P: 8223.34 (SE +/- 21.18, N = 3). 7742 2P Repeat: 8108.43 (SE +/- 9.74, N = 3). 1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi 2. Open MPI 4.0.3
NAS Parallel Benchmarks 3.4 - Test / Class: EP.D (Total Mop/s, More Is Better). EPYC 7742 2P: 8426.04 (SE +/- 14.28, N = 3). 7742 2P Repeat: 7885.52 (SE +/- 8.01, N = 3). 1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi 2. Open MPI 4.0.3
NAS Parallel Benchmarks 3.4 - Test / Class: FT.C (Total Mop/s, More Is Better). EPYC 7742 2P: 76051.36 (SE +/- 839.98, N = 4). 7742 2P Repeat: 71133.34 (SE +/- 262.47, N = 3). 1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi 2. Open MPI 4.0.3
NAS Parallel Benchmarks 3.4 - Test / Class: IS.D (Total Mop/s, More Is Better). EPYC 7742 2P: 3268.82 (SE +/- 30.48, N = 3). 7742 2P Repeat: 3313.12 (SE +/- 10.37, N = 3). 1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi 2. Open MPI 4.0.3
NAS Parallel Benchmarks 3.4 - Test / Class: LU.C (Total Mop/s, More Is Better). EPYC 7742 2P: 194294.65 (SE +/- 770.77, N = 3). 7742 2P Repeat: 176465.67 (SE +/- 1600.26, N = 3). 1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi 2. Open MPI 4.0.3
NAS Parallel Benchmarks 3.4 - Test / Class: MG.C (Total Mop/s, More Is Better). EPYC 7742 2P: 72806.59 (SE +/- 474.90, N = 3). 7742 2P Repeat: 73254.92 (SE +/- 149.89, N = 3). 1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi 2. Open MPI 4.0.3
LeelaChessZero LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.
LeelaChessZero 0.26 - Backend: BLAS (Nodes Per Second, More Is Better). EPYC 7742 2P: 3936 (SE +/- 49.55, N = 9). 7742 2P Repeat: 3333 (SE +/- 37.92, N = 4). 1. (CXX) g++ options: -flto -pthread
LeelaChessZero 0.26 - Backend: Eigen (Nodes Per Second, More Is Better). EPYC 7742 2P: 4198 (SE +/- 41.70, N = 3). 7742 2P Repeat: 3512 (SE +/- 49.56, N = 9). 1. (CXX) g++ options: -flto -pthread
Parboil The Parboil Benchmarks from the IMPACT Research Group at University of Illinois are a set of throughput computing applications for looking at computing architecture and compilers. Parboil test-cases support OpenMP, OpenCL, and CUDA multi-processing environments. However, at this time the test profile is just making use of the OpenMP and OpenCL test workloads. Learn more via the OpenBenchmarking.org test page.
Parboil 2.5 - Test: OpenMP LBM (Seconds, Fewer Is Better). EPYC 7742 2P: 51.02 (SE +/- 1.30, N = 15). 7742 2P Repeat: 74.07 (SE +/- 0.77, N = 3). 1. (CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp
Parboil 2.5 - Test: OpenMP CUTCP (Seconds, Fewer Is Better). EPYC 7742 2P: 0.831875 (SE +/- 0.009850, N = 3). 7742 2P Repeat: 0.852529 (SE +/- 0.008717, N = 15). 1. (CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp
Parboil 2.5 - Test: OpenMP Stencil (Seconds, Fewer Is Better). EPYC 7742 2P: 5.389561 (SE +/- 0.018548, N = 3). 7742 2P Repeat: 5.544628 (SE +/- 0.042058, N = 3). 1. (CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp
Parboil 2.5 - Test: OpenMP MRI Gridding (Seconds, Fewer Is Better). EPYC 7742 2P: 194.08 (SE +/- 1.09, N = 3). 7742 2P Repeat: 208.04 (SE +/- 1.80, N = 3). 1. (CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp
CloverLeaf CloverLeaf is a Lagrangian-Eulerian hydrodynamics benchmark. This test profile currently makes use of CloverLeaf's OpenMP version and is benchmarked with the clover_bm.in input file (Problem 5). Learn more via the OpenBenchmarking.org test page.
CloverLeaf - Lagrangian-Eulerian Hydrodynamics (Seconds, Fewer Is Better). EPYC 7742 2P: 23.36 (SE +/- 0.38, N = 15). 7742 2P Repeat: 2994.27 (SE +/- 0.03, N = 3). 1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp
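The two CloverLeaf times differ by roughly two orders of magnitude, which is far outside the reported standard errors; the page does not say why. When comparing two runs like these, a simple relative-change check can flag results worth a second look. The sketch below is one way to do that; the function names and the 10% threshold are arbitrary assumptions, and the values are copied from this page.

```python
def relative_change(baseline, repeat):
    """Fractional change of `repeat` relative to `baseline`."""
    return (repeat - baseline) / baseline


def flag_outliers(results, threshold=0.10):
    """Return the names of tests whose two runs differ by more than `threshold`."""
    return [
        name
        for name, (first, second) in results.items()
        if abs(relative_change(first, second)) > threshold
    ]


# (EPYC 7742 2P, 7742 2P Repeat) in seconds, taken from this result file.
results = {
    "CloverLeaf": (23.36, 2994.27),
    "Dolfyn 0.527": (20.21, 20.20),
}
flagged = flag_outliers(results)
print(flagged)  # prints ['CloverLeaf']
```

A flagged test is only a prompt to inspect the run logs; the check says nothing about which of the two numbers is the representative one.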
Rodinia Rodinia is a suite focused upon accelerating compute-intensive applications with accelerators. CUDA, OpenMP, and OpenCL parallel models are supported by the included applications. This profile utilizes select OpenCL, NVIDIA CUDA and OpenMP test binaries at the moment. Learn more via the OpenBenchmarking.org test page.
Rodinia 3.1 - Test: OpenMP LavaMD (Seconds, Fewer Is Better). EPYC 7742 2P: 30.21 (SE +/- 0.27, N = 3). 7742 2P Repeat: 32.51 (SE +/- 0.11, N = 3). 1. (CXX) g++ options: -O2 -lOpenCL
Rodinia 3.1 - Test: OpenMP HotSpot3D (Seconds, Fewer Is Better). EPYC 7742 2P: 112.83 (SE +/- 1.17, N = 15). 7742 2P Repeat: 150.97 (SE +/- 0.67, N = 3). 1. (CXX) g++ options: -O2 -lOpenCL
Rodinia 3.1 - Test: OpenMP Leukocyte (Seconds, Fewer Is Better). EPYC 7742 2P: 50.85 (SE +/- 0.49, N = 3). 7742 2P Repeat: 55.86 (SE +/- 0.43, N = 15). 1. (CXX) g++ options: -O2 -lOpenCL
Rodinia 3.1 - Test: OpenMP CFD Solver (Seconds, Fewer Is Better). EPYC 7742 2P: 10.63 (SE +/- 0.12, N = 3). 7742 2P Repeat: 219.38 (SE +/- 0.14, N = 3). 1. (CXX) g++ options: -O2 -lOpenCL
Rodinia 3.1 - Test: OpenMP Streamcluster (Seconds, Fewer Is Better). EPYC 7742 2P: 9.970 (SE +/- 0.138, N = 15). 7742 2P Repeat: 96.160 (SE +/- 4.228, N = 12). 1. (CXX) g++ options: -O2 -lOpenCL
NAMD NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. Learn more via the OpenBenchmarking.org test page.
NAMD 2.14 - ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better). EPYC 7742 2P: 0.27952 (SE +/- 0.00196, N = 3). 7742 2P Repeat: 0.28306 (SE +/- 0.00247, N = 3).
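NAMD results here are in days/ns (wall-clock days needed per simulated nanosecond), so lower is better. Molecular dynamics throughput is also often quoted the other way round, as ns/day. The conversion is just a reciprocal; a minimal sketch, with a hypothetical helper name:

```python
def ns_per_day(days_per_ns):
    """Convert a days/ns figure into simulated nanoseconds per day."""
    if days_per_ns <= 0:
        raise ValueError("days/ns must be positive")
    return 1.0 / days_per_ns


# days/ns values as reported on this page.
for label, value in [("EPYC 7742 2P", 0.27952), ("7742 2P Repeat", 0.28306)]:
    print(f"{label}: {ns_per_day(value):.2f} ns/day")
```

Because the reciprocal reverses the ordering, the run with the lower days/ns figure has the higher ns/day figure.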
Dolfyn Dolfyn is a Computational Fluid Dynamics (CFD) code built on modern numerical simulation techniques. The Dolfyn test profile measures the execution time of the computational fluid dynamics demos bundled with Dolfyn. Learn more via the OpenBenchmarking.org test page.
Dolfyn 0.527 - Computational Fluid Dynamics (Seconds, Fewer Is Better). EPYC 7742 2P: 20.21 (SE +/- 0.04, N = 3). 7742 2P Repeat: 20.20 (SE +/- 0.04, N = 3).
lzbench 1.8 - Test: XZ 0 - Process: Decompression (MB/s, More Is Better). EPYC 7742 2P: 98. 7742 2P Repeat: 99. SE +/- 0.33, N = 3 (only one SE value is given). 1. (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Zstd 1 - Process: Compression (MB/s, More Is Better). EPYC 7742 2P: 435 (SE +/- 0.88, N = 3). 7742 2P Repeat: 437 (SE +/- 0.67, N = 3). 1. (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Zstd 1 - Process: Decompression (MB/s, More Is Better). EPYC 7742 2P: 1346 (SE +/- 1.45, N = 3). 7742 2P Repeat: 1334 (SE +/- 4.36, N = 3). 1. (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Zstd 8 - Process: Compression (MB/s, More Is Better). EPYC 7742 2P: 82. 7742 2P Repeat: 83. SE +/- 0.67, N = 3 (only one SE value is given). 1. (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Zstd 8 - Process: Decompression (MB/s, More Is Better). EPYC 7742 2P: 1497 (SE +/- 0.67, N = 3). 7742 2P Repeat: 1487 (SE +/- 0.88, N = 3). 1. (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Crush 0 - Process: Compression (MB/s, More Is Better). EPYC 7742 2P: 89. 7742 2P Repeat: 89. SE +/- 0.67, N = 3 (only one SE value is given). 1. (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Crush 0 - Process: Decompression (MB/s, More Is Better). EPYC 7742 2P: 381. 7742 2P Repeat: 382. 1. (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Brotli 0 - Process: Compression (MB/s, More Is Better). EPYC 7742 2P: 415 (SE +/- 1.53, N = 3). 7742 2P Repeat: 416 (SE +/- 0.67, N = 3). 1. (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Brotli 0 - Process: Decompression (MB/s, More Is Better). EPYC 7742 2P: 482 (SE +/- 1.53, N = 3). 7742 2P Repeat: 481 (SE +/- 1.86, N = 3). 1. (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Brotli 2 - Process: Compression (MB/s, More Is Better). EPYC 7742 2P: 165 (SE +/- 0.58, N = 3). 7742 2P Repeat: 165 (SE +/- 0.67, N = 3). 1. (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Brotli 2 - Process: Decompression (MB/s, More Is Better). EPYC 7742 2P: 571. 7742 2P Repeat: 569. SE +/- 1.67, N = 3 (only one SE value is given). 1. (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Libdeflate 1 - Process: Compression (MB/s, More Is Better). EPYC 7742 2P: 205. 7742 2P Repeat: 206. SE +/- 0.33, N = 3 (only one SE value is given). 1. (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Libdeflate 1 - Process: Decompression (MB/s, More Is Better). EPYC 7742 2P: 959 (SE +/- 5.51, N = 3). 7742 2P Repeat: 961 (SE +/- 0.67, N = 3). 1. (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
Algebraic Multi-Grid Benchmark AMG is a parallel algebraic multigrid solver for linear systems arising from problems on unstructured grids. The driver provided with AMG builds linear systems for various 3-dimensional problems. Learn more via the OpenBenchmarking.org test page.
Algebraic Multi-Grid Benchmark 1.2 (Figure Of Merit, More Is Better). EPYC 7742 2P: 1247427667 (SE +/- 1663713.95, N = 3). 7742 2P Repeat: 1246138667 (SE +/- 737641.36, N = 3). 1. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -pthread -lmpi
FFTE FFTE is a package by Daisuke Takahashi to compute Discrete Fourier Transforms of 1-, 2- and 3- dimensional sequences of length (2^p)*(3^q)*(5^r). Learn more via the OpenBenchmarking.org test page.
FFTE 7.0 - N=256, 3D Complex FFT Routine (MFLOPS, More Is Better). EPYC 7742 2P: 150712.97 (SE +/- 3136.68, N = 12). 7742 2P Repeat: 148613.89 (SE +/- 3906.26, N = 15). 1. (F9X) gfortran options: -O3 -fomit-frame-pointer -fopenmp
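FFTE handles sequence lengths of the form (2^p)*(3^q)*(5^r). The helper below is a small illustration of what that constraint means, not part of FFTE itself: it repeatedly divides out 2, 3 and 5 and reports the exponents, or None when another prime factor remains. The function name is hypothetical.

```python
def factor_235(n):
    """Return (p, q, r) with n == 2**p * 3**q * 5**r, or None if no such exponents exist."""
    if n < 1:
        return None
    exponents = []
    for prime in (2, 3, 5):
        count = 0
        while n % prime == 0:
            n //= prime
            count += 1
        exponents.append(count)
    # Anything left over means n has a prime factor other than 2, 3 or 5.
    return tuple(exponents) if n == 1 else None


print(factor_235(256))  # the N=256 used in this test: (8, 0, 0)
print(factor_235(360))  # (3, 2, 1)
print(factor_235(7))    # None
```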
FFTW FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions. Learn more via the OpenBenchmarking.org test page.
FFTW 3.3.6 - Build: Stock - Size: 1D FFT Size 4096 (Mflops, More Is Better). EPYC 7742 2P: 6873.9 (SE +/- 6.26, N = 3). 7742 2P Repeat: 6890.6 (SE +/- 9.28, N = 3). 1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6 - Build: Stock - Size: 2D FFT Size 2048 (Mflops, More Is Better). EPYC 7742 2P: 6160.7 (SE +/- 3.97, N = 3). 7742 2P Repeat: 6086.5 (SE +/- 47.11, N = 3). 1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6 - Build: Stock - Size: 2D FFT Size 4096 (Mflops, More Is Better). EPYC 7742 2P: 5387.5 (SE +/- 13.54, N = 3). 7742 2P Repeat: 5306.8 (SE +/- 54.29, N = 3). 1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6 - Build: Float + SSE - Size: 1D FFT Size 4096 (Mflops, More Is Better). EPYC 7742 2P: 44939 (SE +/- 343.27, N = 3). 7742 2P Repeat: 44294 (SE +/- 125.07, N = 3). 1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6 - Build: Float + SSE - Size: 2D FFT Size 2048 (Mflops, More Is Better). EPYC 7742 2P: 26509 (SE +/- 200.43, N = 12). 7742 2P Repeat: 26795 (SE +/- 139.43, N = 3). 1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6 - Build: Float + SSE - Size: 2D FFT Size 4096 (Mflops, More Is Better). EPYC 7742 2P: 18663. 7742 2P Repeat: 17541. SE +/- 230.17, N = 3 (only one SE value is given). 1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
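For readers unfamiliar with what the FFTW and FFTE tests compute, the sketch below evaluates the discrete Fourier transform directly from its textbook definition, X[k] = sum over j of x[j] * exp(-2*pi*i*j*k/N). This is a plain O(N^2) illustration, not the fast algorithm FFTW uses, and it is far too slow for sizes like the 4096-point transforms benchmarked here.

```python
import cmath


def dft(x):
    """Naive O(N^2) discrete Fourier transform of a sequence of numbers."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


# A constant signal has all of its energy in the zero-frequency bin.
spectrum = dft([1.0, 1.0, 1.0, 1.0])
print([round(abs(value), 6) for value in spectrum])
```

FFT libraries produce the same result in O(N log N) operations, which is what makes the Mflops figures above achievable at large sizes.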
Pennant 1.0.1 - Test: leblancbig (Hydro Cycle Time - Seconds, Fewer Is Better). EPYC 7742 2P: 3.949192 (SE +/- 0.028494, N = 3). 7742 2P Repeat: 3.926446 (SE +/- 0.047128, N = 3). 1. (CXX) g++ options: -fopenmp -pthread -lmpi_cxx -lmpi
Timed MrBayes Analysis This test performs a Bayesian analysis of a set of primate genome sequences in order to estimate their phylogeny. Learn more via the OpenBenchmarking.org test page.
Timed MrBayes Analysis 3.2.7 - Primate Phylogeny Analysis (Seconds, Fewer Is Better). EPYC 7742 2P: 108.67 (SE +/- 0.20, N = 3). 7742 2P Repeat: 109.16 (SE +/- 0.21, N = 3). 1. (CC) gcc options: -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -msha -maes -mavx -mfma -mavx2 -mrdrnd -mbmi -mbmi2 -madx -mabm -O3 -std=c99 -pedantic -lm -lreadline
NWChem NWChem is an open-source high performance computational chemistry package. Per NWChem's documentation, "NWChem aims to provide its users with computational chemistry tools that are scalable both in their ability to treat large scientific computational chemistry problems efficiently, and in their use of available parallel computing resources from high-performance parallel supercomputers to conventional workstation clusters." Learn more via the OpenBenchmarking.org test page.
NWChem 7.0.2 - Input: C240 Buckyball (Seconds, Fewer Is Better). EPYC 7742 2P: 1963.3. 7742 2P Repeat: 1933.8. 1. (F9X) gfortran options: -lnwctask -lccsd -lmcscf -lselci -lmp2 -lmoints -lstepper -ldriver -loptim -lnwdft -lgradients -lcphf -lesp -lddscf -ldangchang -lguess -lhessian -lvib -lnwcutil -lrimp2 -lproperty -lsolvation -lnwints -lprepar -lnwmd -lnwpw -lofpw -lpaw -lpspw -lband -lnwpwlib -lcafe -lspace -lanalyze -lqhop -lpfft -ldplot -ldrdy -lvscf -lqmmm -lqmd -letrans -ltce -lbq -lmm -lcons -lperfm -ldntmc -lccca -ldimqm -lga -larmci -lpeigs -l64to32 -lopenblas -lpthread -lrt -llapack -lnwcblas -lmpi_usempif08 -lmpi_mpifh -lmpi -lcomex -lm -m64 -ffast-math -std=legacy -fdefault-integer-8 -finline-functions -O2
QMCPACK QMCPACK is a modern high-performance open-source Quantum Monte Carlo (QMC) simulation code; this benchmark runs its H2O example with MPI. QMCPACK is a production-level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids. QMCPACK is supported by the U.S. Department of Energy. Learn more via the OpenBenchmarking.org test page.
QMCPACK 3.10 - Input: simple-H2O (Total Execution Time - Seconds, Fewer Is Better). EPYC 7742 2P: 44.32 (SE +/- 0.29, N = 3). 7742 2P Repeat: 46.28 (SE +/- 1.61, N = 15). 1. (CXX) g++ options: -fopenmp -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -march=native -O3 -fomit-frame-pointer -ffast-math -pthread -lm
Incompact3D Incompact3d is a Fortran-MPI based, finite-difference high-performance code for solving the incompressible Navier-Stokes equations along with as many scalar transport equations as needed. Learn more via the OpenBenchmarking.org test page.
Incompact3D 2020-09-17 - Input: Cylinder (Seconds, Fewer Is Better). EPYC 7742 2P: 345.79 (SE +/- 0.88, N = 3). 7742 2P Repeat: 348.52 (SE +/- 1.44, N = 3). 1. (F9X) gfortran options: -cpp -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
Monte Carlo Simulations of Ionised Nebulae Mocassin is the Monte Carlo Simulations of Ionised Nebulae. MOCASSIN is a fully 3D or 2D photoionisation and dust radiative transfer code which employs a Monte Carlo approach to the transfer of radiation through media of arbitrary geometry and density distribution. Learn more via the OpenBenchmarking.org test page.
Monte Carlo Simulations of Ionised Nebulae 2019-03-24 - Input: Dust 2D tau100.0 (Seconds, Fewer Is Better). EPYC 7742 2P: 239. 7742 2P Repeat: 239. SE +/- 0.33, N = 3 (only one SE value is given). 1. (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O3 -O2 -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
OpenFOAM OpenFOAM is the leading free, open source software for computational fluid dynamics (CFD). Learn more via the OpenBenchmarking.org test page.
OpenFOAM 8 - Input: Motorbike 30M (Seconds, Fewer Is Better). EPYC 7742 2P: 14.12 (SE +/- 0.07, N = 3). 7742 2P Repeat: 14.15 (SE +/- 0.08, N = 3). 1. (CXX) g++ options: -std=c++11 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -ldynamicMesh -ldecompose -lgenericPatchFields -lmetisDecomp -lscotchDecomp -llagrangian -lregionModels -lOpenFOAM -ldl -lm
OpenFOAM 8 - Input: Motorbike 60M (Seconds, Fewer Is Better). EPYC 7742 2P: 112.80 (SE +/- 0.06, N = 3). 7742 2P Repeat: 112.70 (SE +/- 0.16, N = 3). 1. (CXX) g++ options: -std=c++11 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -ldynamicMesh -ldecompose -lgenericPatchFields -lmetisDecomp -lscotchDecomp -llagrangian -lregionModels -lOpenFOAM -ldl -lm
Quantum ESPRESSO Quantum ESPRESSO is an integrated suite of Open-Source computer codes for electronic-structure calculations and materials modeling at the nanoscale. It is based on density-functional theory, plane waves, and pseudopotentials. Learn more via the OpenBenchmarking.org test page.
Quantum ESPRESSO 6.7 - Input: AUSURF112 (Seconds, Fewer Is Better). EPYC 7742 2P: 1219.36 (SE +/- 4.15, N = 3). 7742 2P Repeat: 1225.26 (SE +/- 3.23, N = 3). 1. (F9X) gfortran options: -lopenblas -lFoX_dom -lFoX_sax -lFoX_wxml -lFoX_common -lFoX_utils -lFoX_fsys -lfftw3 -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
RELION RELION - REgularised LIkelihood OptimisatioN - is a stand-alone computer program for Maximum A Posteriori refinement of (multiple) 3D reconstructions or 2D class averages in cryo-electron microscopy (cryo-EM). It is developed in the research group of Sjors Scheres at the MRC Laboratory of Molecular Biology. Learn more via the OpenBenchmarking.org test page.
RELION 3.1.1 - Test: Basic - Device: CPU (Seconds, Fewer Is Better). EPYC 7742 2P: 542.05 (SE +/- 4.40, N = 3). 7742 2P Repeat: 541.47 (SE +/- 5.12, N = 6). 1. (CXX) g++ options: -fopenmp -std=c++0x -O3 -rdynamic -ldl -ltiff -lfftw3f -lfftw3 -lpng -pthread -lmpi_cxx -lmpi
WebP Image Encode This is a test of Google's libwebp with the cwebp image encode utility and using a sample 6000x4000 pixel JPEG image as the input. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: WebP Image Encode 1.1, Encode Settings: Default (Encode Time - Seconds, Fewer Is Better). EPYC 7742 2P: 1.855 (SE +/- 0.003, N = 3); 7742 2P Repeat: 1.854 (SE +/- 0.002, N = 3). 1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
OpenBenchmarking.org: WebP Image Encode 1.1, Encode Settings: Quality 100 (Encode Time - Seconds, Fewer Is Better). EPYC 7742 2P: 2.863 (SE +/- 0.001, N = 3); 7742 2P Repeat: 2.864 (SE +/- 0.003, N = 3). 1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
OpenBenchmarking.org: WebP Image Encode 1.1, Encode Settings: Quality 100, Lossless (Encode Time - Seconds, Fewer Is Better). EPYC 7742 2P: 20.36 (SE +/- 0.02, N = 3); 7742 2P Repeat: 20.43 (SE +/- 0.04, N = 3). 1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
OpenBenchmarking.org: WebP Image Encode 1.1, Encode Settings: Quality 100, Highest Compression (Encode Time - Seconds, Fewer Is Better). EPYC 7742 2P: 8.904 (SE +/- 0.002, N = 3); 7742 2P Repeat: 8.864 (SE +/- 0.002, N = 3). 1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
OpenBenchmarking.org: WebP Image Encode 1.1, Encode Settings: Quality 100, Lossless, Highest Compression (Encode Time - Seconds, Fewer Is Better). EPYC 7742 2P: 41.98 (SE +/- 0.05, N = 3); 7742 2P Repeat: 41.94 (SE +/- 0.03, N = 3). 1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
OpenBenchmarking.org: LZ4 Compression 1.9.3, Compression Level: 1 - Decompression Speed (MB/s, More Is Better). EPYC 7742 2P: 11001.5 (SE +/- 91.80, N = 3); 7742 2P Repeat: 10868.7 (SE +/- 21.34, N = 3). 1. (CC) gcc options: -O3
OpenBenchmarking.org: LZ4 Compression 1.9.3, Compression Level: 3 - Compression Speed (MB/s, More Is Better). EPYC 7742 2P: 45.36 (SE +/- 0.39, N = 3); 7742 2P Repeat: 45.16 (SE +/- 0.26, N = 3). 1. (CC) gcc options: -O3
OpenBenchmarking.org: LZ4 Compression 1.9.3, Compression Level: 3 - Decompression Speed (MB/s, More Is Better). EPYC 7742 2P: 10188.3 (SE +/- 80.01, N = 3); 7742 2P Repeat: 10373.0 (SE +/- 28.45, N = 3). 1. (CC) gcc options: -O3
OpenBenchmarking.org: LZ4 Compression 1.9.3, Compression Level: 9 - Compression Speed (MB/s, More Is Better). EPYC 7742 2P: 44.86 (SE +/- 0.35, N = 3); 7742 2P Repeat: 43.49 (SE +/- 0.03, N = 3). 1. (CC) gcc options: -O3
OpenBenchmarking.org: LZ4 Compression 1.9.3, Compression Level: 9 - Decompression Speed (MB/s, More Is Better). EPYC 7742 2P: 10418.3 (SE +/- 44.54, N = 3); 7742 2P Repeat: 10278.8 (SE +/- 45.56, N = 3). 1. (CC) gcc options: -O3
Zstd Compression This test measures the time needed to compress/decompress a sample file (a FreeBSD disk image - FreeBSD-12.2-RELEASE-amd64-memstick.img) using Zstd compression with options for different compression levels / settings. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: Zstd Compression 1.4.9, Compression Level: 8 - Compression Speed (MB/s, More Is Better). EPYC 7742 2P: 1989.9 (SE +/- 78.60, N = 12); 2 x AMD EPYC 7742 64-Core: 2243.5 (SE +/- 63.80, N = 15); 7742 2P Repeat: 1992.5 (SE +/- 73.14, N = 15). 1. (CC) gcc options: -O3 -pthread -lz -llzma
OpenBenchmarking.org: Zstd Compression 1.4.9, Compression Level: 8 - Decompression Speed (MB/s, More Is Better). EPYC 7742 2P: 2975.7 (SE +/- 3.84, N = 11); 2 x AMD EPYC 7742 64-Core: 2982.6 (SE +/- 2.98, N = 15); 7742 2P Repeat: 2977.7 (SE +/- 3.43, N = 15). 1. (CC) gcc options: -O3 -pthread -lz -llzma
OpenBenchmarking.org: Zstd Compression 1.4.9, Compression Level: 19 - Compression Speed (MB/s, More Is Better). EPYC 7742 2P: 70.7 (SE +/- 0.71, N = 3); 2 x AMD EPYC 7742 64-Core: 69.2 (SE +/- 1.15, N = 15); 7742 2P Repeat: 70.9 (SE +/- 1.10, N = 15). 1. (CC) gcc options: -O3 -pthread -lz -llzma
OpenBenchmarking.org: Zstd Compression 1.4.9, Compression Level: 19 - Decompression Speed (MB/s, More Is Better). EPYC 7742 2P: 2792.5 (SE +/- 6.97, N = 3); 2 x AMD EPYC 7742 64-Core: 2781.5 (SE +/- 2.65, N = 15); 7742 2P Repeat: 2791.9 (SE +/- 2.86, N = 15). 1. (CC) gcc options: -O3 -pthread -lz -llzma
OpenBenchmarking.org: Zstd Compression 1.4.9, Compression Level: 3, Long Mode - Compression Speed (MB/s, More Is Better). EPYC 7742 2P: 629.2 (SE +/- 13.86, N = 15); 7742 2P Repeat: 620.5 (SE +/- 14.50, N = 13). 1. (CC) gcc options: -O3 -pthread -lz -llzma
OpenBenchmarking.org: Zstd Compression 1.4.9, Compression Level: 3, Long Mode - Decompression Speed (MB/s, More Is Better). EPYC 7742 2P: 3092.0 (SE +/- 2.93, N = 15); 7742 2P Repeat: 3090.0 (SE +/- 2.35, N = 13). 1. (CC) gcc options: -O3 -pthread -lz -llzma
OpenBenchmarking.org: Zstd Compression 1.4.9, Compression Level: 8, Long Mode - Compression Speed (MB/s, More Is Better). EPYC 7742 2P: 587.6 (SE +/- 10.51, N = 15); 2 x AMD EPYC 7742 64-Core: 561.6 (SE +/- 4.02, N = 3); 7742 2P Repeat: 566.1 (SE +/- 6.00, N = 3). 1. (CC) gcc options: -O3 -pthread -lz -llzma
OpenBenchmarking.org: Zstd Compression 1.4.9, Compression Level: 8, Long Mode - Decompression Speed (MB/s, More Is Better). EPYC 7742 2P: 3205.6 (SE +/- 3.61, N = 15); 2 x AMD EPYC 7742 64-Core: 3199.6 (SE +/- 9.37, N = 3); 7742 2P Repeat: 3205.8 (SE +/- 10.95, N = 3). 1. (CC) gcc options: -O3 -pthread -lz -llzma
OpenBenchmarking.org: Zstd Compression 1.4.9, Compression Level: 19, Long Mode - Compression Speed (MB/s, More Is Better). EPYC 7742 2P: 32.8 (SE +/- 0.58, N = 15); 2 x AMD EPYC 7742 64-Core: 33.0 (SE +/- 0.52, N = 12); 7742 2P Repeat: 33.9 (SE +/- 0.66, N = 12). 1. (CC) gcc options: -O3 -pthread -lz -llzma
OpenBenchmarking.org: Zstd Compression 1.4.9, Compression Level: 19, Long Mode - Decompression Speed (MB/s, More Is Better). EPYC 7742 2P: 2828.5 (SE +/- 3.12, N = 15); 2 x AMD EPYC 7742 64-Core: 2824.9 (SE +/- 2.77, N = 12); 7742 2P Repeat: 2825.8 (SE +/- 4.52, N = 12). 1. (CC) gcc options: -O3 -pthread -lz -llzma
JPEG XL The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities with JPEG XL offering better image quality and compression over legacy JPEG. This test profile is currently focused on the multi-threaded JPEG XL image encode performance. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: JPEG XL 0.3.1, Input: PNG - Encode Speed: 5 (MP/s, More Is Better). EPYC 7742 2P: 63.73 (SE +/- 0.58, N = 15); 7742 2P Repeat: 64.40 (SE +/- 0.52, N = 15). 1. (CXX) g++ options: -funwind-tables -O3 -O2 -fPIE -pie -pthread -ldl
OpenBenchmarking.org: JPEG XL 0.3.1, Input: PNG - Encode Speed: 7 (MP/s, More Is Better). EPYC 7742 2P: 9.77 (SE +/- 0.02, N = 3); 7742 2P Repeat: 9.77 (SE +/- 0.03, N = 3). 1. (CXX) g++ options: -funwind-tables -O3 -O2 -fPIE -pie -pthread -ldl
OpenBenchmarking.org: JPEG XL 0.3.1, Input: PNG - Encode Speed: 8 (MP/s, More Is Better). EPYC 7742 2P: 0.70 (SE +/- 0.00, N = 3); 7742 2P Repeat: 0.70 (SE +/- 0.00, N = 3). 1. (CXX) g++ options: -funwind-tables -O3 -O2 -fPIE -pie -pthread -ldl
OpenBenchmarking.org: JPEG XL 0.3.1, Input: JPEG - Encode Speed: 5 (MP/s, More Is Better). EPYC 7742 2P: 51.71 (SE +/- 0.39, N = 11); 7742 2P Repeat: 53.11 (SE +/- 0.58, N = 3). 1. (CXX) g++ options: -funwind-tables -O3 -O2 -fPIE -pie -pthread -ldl
OpenBenchmarking.org: JPEG XL 0.3.1, Input: JPEG - Encode Speed: 7 (MP/s, More Is Better). EPYC 7742 2P: 51.36 (SE +/- 0.17, N = 3); 7742 2P Repeat: 51.17 (SE +/- 0.33, N = 15). 1. (CXX) g++ options: -funwind-tables -O3 -O2 -fPIE -pie -pthread -ldl
OpenBenchmarking.org: JPEG XL 0.3.1, Input: JPEG - Encode Speed: 8 (MP/s, More Is Better). EPYC 7742 2P: 22.91 (SE +/- 0.17, N = 15); 7742 2P Repeat: 22.72 (SE +/- 0.26, N = 15). 1. (CXX) g++ options: -funwind-tables -O3 -O2 -fPIE -pie -pthread -ldl
JPEG XL Decoding The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities with JPEG XL offering better image quality and compression over legacy JPEG. This test profile is suited for JPEG XL decode performance testing to a PNG output file; the pts/jpegxl test is for encode performance. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: JPEG XL Decoding 0.3.1, CPU Threads: All (MP/s, More Is Better). EPYC 7742 2P: 99.54 (SE +/- 0.77, N = 3); 7742 2P Repeat: 99.32 (SE +/- 0.25, N = 3).
srsLTE srsLTE is an open-source LTE software radio suite created by Software Radio Systems (SRS). srsLTE can be used for building your own software defined (SDR) LTE mobile network. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: srsLTE 20.10.1, Test: OFDM_Test (Samples / Second, More Is Better). EPYC 7742 2P: 98333333 (SE +/- 520683.31, N = 3); 2P: 101100000 (SE +/- 404145.19, N = 3); 7742 2P Repeat: 97960000 (SE +/- 1084711.94, N = 5). 1. (CXX) g++ options: -std=c++11 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -rdynamic -lpthread -lmbedcrypto -lconfig++ -lsctp -lbladeRF -lm -lfftw3f
OpenBenchmarking.org: srsLTE 20.10.1, Test: PHY_DL_Test (eNb Mb/s, More Is Better). EPYC 7742 2P: 197.7 (SE +/- 0.38, N = 3); 2P: 207.6 (SE +/- 0.19, N = 3); 7742 2P Repeat: 204.5 (SE +/- 0.92, N = 3). 1. (CXX) g++ options: -std=c++11 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -rdynamic -lpthread -lmbedcrypto -lconfig++ -lsctp -lbladeRF -lm -lfftw3f
OpenBenchmarking.org: srsLTE 20.10.1, Test: PHY_DL_Test (UE Mb/s, More Is Better). EPYC 7742 2P: 83.0 (SE +/- 0.33, N = 3); 2P: 87.5 (SE +/- 0.03, N = 3); 7742 2P Repeat: 86.8 (SE +/- 0.21, N = 3). 1. (CXX) g++ options: -std=c++11 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -rdynamic -lpthread -lmbedcrypto -lconfig++ -lsctp -lbladeRF -lm -lfftw3f
LuaJIT This test profile is a collection of Lua scripts/benchmarks run against a locally-built copy of LuaJIT upstream. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: LuaJIT 2.1-git, Test: Composite (Mflops, More Is Better). EPYC 7742 2P: 1178.95 (SE +/- 14.63, N = 4); 7742 2P Repeat: 1200.41 (SE +/- 9.48, N = 15). 1. (CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -U_FORTIFY_SOURCE -fno-stack-protector
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: GraphicsMagick 1.3.33, Operation: Swirl (Iterations Per Minute, More Is Better). EPYC 7742 2P: 1721 (SE +/- 11.05, N = 3); 7742 2P Repeat: 1730 (SE +/- 14.38, N = 3). 1. (CC) gcc options: -fopenmp -O2 -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
OpenBenchmarking.org: GraphicsMagick 1.3.33, Operation: Rotate (Iterations Per Minute, More Is Better). EPYC 7742 2P: 543 (SE +/- 4.66, N = 8); 7742 2P Repeat: 523 (SE +/- 4.25, N = 9). 1. (CC) gcc options: -fopenmp -O2 -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
OpenBenchmarking.org: GraphicsMagick 1.3.33, Operation: Sharpen (Iterations Per Minute, More Is Better). EPYC 7742 2P: 833; 7742 2P Repeat: 829. Only one SE is listed for this test: SE +/- 5.33, N = 3. 1. (CC) gcc options: -fopenmp -O2 -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
OpenBenchmarking.org: GraphicsMagick 1.3.33, Operation: Enhanced (Iterations Per Minute, More Is Better). EPYC 7742 2P: 1199 (SE +/- 2.40, N = 3); 7742 2P Repeat: 1203 (SE +/- 0.88, N = 3). 1. (CC) gcc options: -fopenmp -O2 -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
OpenBenchmarking.org: GraphicsMagick 1.3.33, Operation: Resizing (Iterations Per Minute, More Is Better). EPYC 7742 2P: 69 (SE +/- 0.88, N = 3); 7742 2P Repeat: 68 (SE +/- 0.88, N = 3). 1. (CC) gcc options: -fopenmp -O2 -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
OpenBenchmarking.org: GraphicsMagick 1.3.33, Operation: Noise-Gaussian (Iterations Per Minute, More Is Better). EPYC 7742 2P: 650 (SE +/- 6.04, N = 15); 7742 2P Repeat: 654 (SE +/- 4.16, N = 3). 1. (CC) gcc options: -fopenmp -O2 -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
OpenBenchmarking.org: GraphicsMagick 1.3.33, Operation: HWB Color Space (Iterations Per Minute, More Is Better). EPYC 7742 2P: 929 (SE +/- 13.26, N = 15); 7742 2P Repeat: 880 (SE +/- 4.06, N = 3). 1. (CC) gcc options: -fopenmp -O2 -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
oneDNN This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: oneDNN 2.0, Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). EPYC 7742 2P: 2.07347 (SE +/- 0.00986, N = 3; MIN: 1.86); 7742 2P Repeat: 2.06564 (SE +/- 0.00722, N = 3; MIN: 1.87). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
OpenBenchmarking.org: oneDNN 2.0, Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). EPYC 7742 2P: 1.54650 (SE +/- 0.02717, N = 15; MIN: 1.18); 7742 2P Repeat: 2.49866 (SE +/- 0.13368, N = 12; MIN: 1.59). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
OpenBenchmarking.org: oneDNN 2.0, Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). EPYC 7742 2P: 2.13329 (SE +/- 0.01302, N = 3; MIN: 1.93); 7742 2P Repeat: 2.13758 (SE +/- 0.00603, N = 3; MIN: 1.93). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
OpenBenchmarking.org: oneDNN 2.0, Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). EPYC 7742 2P: 3.08917 (SE +/- 0.03597, N = 3; MIN: 2.76); 7742 2P Repeat: 2.88822 (SE +/- 0.01047, N = 3; MIN: 2.66). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
OpenBenchmarking.org: oneDNN 2.0, Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). EPYC 7742 2P: 0.724395 (SE +/- 0.008019, N = 3; MIN: 0.67); 7742 2P Repeat: 1.954610 (SE +/- 0.027365, N = 3; MIN: 1.85). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
OpenBenchmarking.org: oneDNN 2.0, Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). EPYC 7742 2P: 2.86937 (SE +/- 0.01849, N = 3; MIN: 2.62); 7742 2P Repeat: 2.86260 (SE +/- 0.01034, N = 3; MIN: 2.65). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
OpenBenchmarking.org: oneDNN 2.0, Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). EPYC 7742 2P: 2.88437 (SE +/- 0.04670, N = 12; MIN: 2.52); 7742 2P Repeat: 2.70898 (SE +/- 0.00982, N = 3; MIN: 2.55). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
OpenBenchmarking.org: oneDNN 2.0, Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). EPYC 7742 2P: 4.68078 (SE +/- 0.15852, N = 13; MIN: 2.73); 7742 2P Repeat: 5.59055 (SE +/- 0.12818, N = 15; MIN: 4.05). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
OpenBenchmarking.org: oneDNN 2.0, Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). EPYC 7742 2P: 2.20539 (SE +/- 0.00840, N = 3; MIN: 2.03); 7742 2P Repeat: 2.19937 (SE +/- 0.00319, N = 3; MIN: 2.03). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
OpenBenchmarking.org: oneDNN 2.0, Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). EPYC 7742 2P: 1.21032 (SE +/- 0.00941, N = 15; MIN: 1.07); 7742 2P Repeat: 1.20163 (SE +/- 0.01177, N = 6; MIN: 1.06). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
OpenBenchmarking.org: oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). EPYC 7742 2P: 2948.06 (SE +/- 36.76, N = 3; MIN: 2599.44); 7742 2P Repeat: 3125.87 (SE +/- 60.50, N = 15; MIN: 2374.92). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
OpenBenchmarking.org: oneDNN 2.0, Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). EPYC 7742 2P: 1267.20 (SE +/- 15.39, N = 15; MIN: 1101.26); 7742 2P Repeat: 1355.40 (SE +/- 34.94, N = 15; MIN: 1136.7). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
OpenBenchmarking.org: oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). EPYC 7742 2P: 2910.88 (SE +/- 63.06, N = 15; MIN: 2232.96); 7742 2P Repeat: 3152.16 (SE +/- 92.93, N = 12; MIN: 2386.06). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
OpenBenchmarking.org: oneDNN 2.0, Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). EPYC 7742 2P: 1281.86 (SE +/- 19.98, N = 15; MIN: 1116.9); 7742 2P Repeat: 1285.17 (SE +/- 14.75, N = 3; MIN: 1199.95). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
OpenBenchmarking.org: oneDNN 2.0, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). EPYC 7742 2P: 0.712936 (SE +/- 0.003926, N = 3; MIN: 0.65); 7742 2P Repeat: 0.714842 (SE +/- 0.008252, N = 3; MIN: 0.65). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
OpenBenchmarking.org: oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better). EPYC 7742 2P: 2923.30 (SE +/- 22.66, N = 3; MIN: 2522.76); 7742 2P Repeat: 3209.71 (SE +/- 91.94, N = 15; MIN: 2404.69). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
OpenBenchmarking.org: oneDNN 2.0, Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better). EPYC 7742 2P: 1245.81 (SE +/- 17.20, N = 15; MIN: 1111.62). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
OpenBenchmarking.org: oneDNN 2.0, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). EPYC 7742 2P: 0.812990 (SE +/- 0.002167, N = 3; MIN: 0.76). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
OpenBenchmarking.org: dav1d 0.8.2, Video Input: Summer Nature 1080p (FPS, More Is Better). EPYC 7742 2P: 1245.91 (SE +/- 12.08, N = 14; MIN: 153.88 / MAX: 1618.67). 1. (CC) gcc options: -pthread -lm
OSPray Intel OSPray is a portable ray-tracing engine for high-performance, high-fidelity scientific visualizations. OSPray builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: OSPray 1.8.5, Demo: San Miguel - Renderer: SciVis (FPS, More Is Better). EPYC 7742 2P: 83.33 (SE +/- 0.00, N = 3; MIN: 26.32 / MAX: 90.91).
OpenBenchmarking.org: OSPray 1.8.5, Demo: Magnetic Reconnection - Renderer: SciVis (FPS, More Is Better). EPYC 7742 2P: 45.45 (SE +/- 0.00, N = 3; MIN: 9.09 / MAX: 47.62).
OpenBenchmarking.org: OSPray 1.8.5, Demo: NASA Streamlines - Renderer: Path Tracer (FPS, More Is Better). EPYC 7742 2P: 30.30 (SE +/- 0.00, N = 3; MIN: 11.24 / MAX: 31.25).
OpenBenchmarking.org: OSPray 1.8.5, Demo: Magnetic Reconnection - Renderer: Path Tracer (FPS, More Is Better). EPYC 7742 2P: 333.33 (SE +/- 0.00, N = 3; MIN: 37.04 / MAX: 500).
TTSIOD 3D Renderer A portable GPL 3D software renderer that supports OpenMP and Intel Threading Building Blocks with many different rendering modes. This version does not use OpenGL but is entirely CPU/software based. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: TTSIOD 3D Renderer 2.3b, Phong Rendering With Soft-Shadow Mapping (FPS, More Is Better). EPYC 7742 2P: 581.88 (SE +/- 9.68, N = 15). 1. (CXX) g++ options: -O3 -fomit-frame-pointer -ffast-math -mtune=native -flto -msse -mrecip -mfpmath=sse -msse2 -mssse3 -lSDL -fopenmp -fwhole-program -lstdc++
OpenBenchmarking.org: AOM AV1 2.0, Encoder Mode: Speed 6 Two-Pass (Frames Per Second, More Is Better). EPYC 7742 2P: 3.36 (SE +/- 0.01, N = 3). 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
OpenBenchmarking.org: AOM AV1 2.0, Encoder Mode: Speed 8 Realtime (Frames Per Second, More Is Better). EPYC 7742 2P: 31.89 (SE +/- 0.37, N = 3). 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
OpenBenchmarking.org: Embree 3.9.0, Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better). EPYC 7742 2P: 59.33 (SE +/- 0.51, N = 3; MIN: 55.07 / MAX: 65.52).
OpenBenchmarking.org: Embree 3.9.0, Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, More Is Better). EPYC 7742 2P: 44.97 (SE +/- 0.32, N = 3; MIN: 41.97 / MAX: 48.41).
OpenBenchmarking.org: Embree 3.9.0, Binary: Pathtracer - Model: Asian Dragon Obj (Frames Per Second, More Is Better). EPYC 7742 2P: 39.05 (SE +/- 0.11, N = 3; MIN: 37.1 / MAX: 42.47).
OpenBenchmarking.org: Embree 3.9.0, Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better). EPYC 7742 2P: 42.12 (SE +/- 0.14, N = 3; MIN: 39.85 / MAX: 44.87).
OpenBenchmarking.org: Embree 3.9.0, Binary: Pathtracer ISPC - Model: Asian Dragon Obj (Frames Per Second, More Is Better). EPYC 7742 2P: 36.33 (SE +/- 0.31, N = 3; MIN: 34.33 / MAX: 39.39).
Kvazaar This is a test of Kvazaar as a CPU-based H.265 video encoder written in the C programming language and optimized in Assembly. Kvazaar is the winner of the 2016 ACM Open-Source Software Competition and developed at the Ultra Video Group, Tampere University, Finland. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: Kvazaar 2.0, Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better). EPYC 7742 2P: 22.70 (SE +/- 0.19, N = 15). 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
OpenBenchmarking.org: Kvazaar 2.0, Video Input: Bosphorus 1080p - Video Preset: Medium (Frames Per Second, More Is Better). EPYC 7742 2P: 64.43 (SE +/- 0.46, N = 3). 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
OpenBenchmarking.org: Kvazaar 2.0, Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better). EPYC 7742 2P: 40.15 (SE +/- 0.92, N = 15). 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
OpenBenchmarking.org: Kvazaar 2.0, Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, More Is Better). EPYC 7742 2P: 44.45 (SE +/- 1.77, N = 12). 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
OpenBenchmarking.org: Kvazaar 2.0, Video Input: Bosphorus 1080p - Video Preset: Very Fast (Frames Per Second, More Is Better). EPYC 7742 2P: 136.84 (SE +/- 0.43, N = 3). 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
OpenBenchmarking.org: Kvazaar 2.0, Video Input: Bosphorus 1080p - Video Preset: Ultra Fast (Frames Per Second, More Is Better). EPYC 7742 2P: 181.14 (SE +/- 2.71, N = 15). 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
SVT-AV1 This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-AV1 CPU-based multi-threaded video encoder for the AV1 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: SVT-AV1 0.8, Encoder Mode: Enc Mode 4 - Input: 1080p (Frames Per Second, More Is Better). EPYC 7742 2P: 7.456 (SE +/- 0.065, N = 3). 1. (CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie
OpenBenchmarking.org: SVT-AV1 0.8, Encoder Mode: Enc Mode 8 - Input: 1080p (Frames Per Second, More Is Better). EPYC 7742 2P: 85.79 (SE +/- 0.35, N = 3). 1. (CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie
SVT-VP9 This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-VP9 CPU-based multi-threaded video encoder for the VP9 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: SVT-VP9 0.1, Tuning: VMAF Optimized - Input: Bosphorus 1080p (Frames Per Second, More Is Better). EPYC 7742 2P: 340.02 (SE +/- 14.04, N = 12). 1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
OpenBenchmarking.org: SVT-VP9 0.1, Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p (Frames Per Second, More Is Better). EPYC 7742 2P: 363.96 (SE +/- 4.38, N = 4). 1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
OpenBenchmarking.org: SVT-VP9 0.1, Tuning: Visual Quality Optimized - Input: Bosphorus 1080p (Frames Per Second, More Is Better). EPYC 7742 2P: 274.56 (SE +/- 0.09, N = 3). 1. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
x264 This is a simple test of the x264 encoder run on the CPU (OpenCL support disabled) with a sample video file. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: x264 2019-12-17, H.264 Video Encoding (Frames Per Second, More Is Better). EPYC 7742 2P: 204.01 (SE +/- 2.66, N = 15). 1. (CC) gcc options: -ldl -lavformat -lavcodec -lavutil -lswscale -m64 -lm -lpthread -O3 -ffast-math -std=gnu99 -fPIC -fomit-frame-pointer -fno-tree-vectorize
OpenBenchmarking.org: x265 3.4, Video Input: Bosphorus 1080p (Frames Per Second, More Is Better). EPYC 7742 2P: 61.36 (SE +/- 0.62, N = 15). 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Stockfish This is a test of Stockfish, an advanced C++11 chess benchmark that can scale up to 128 CPU cores. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: Stockfish 12, Total Time (Nodes Per Second, More Is Better). EPYC 7742 2P: 190042987 (SE +/- 2342263.11, N = 3). 1. (CXX) g++ options: -m64 -lpthread -fno-exceptions -std=c++17 -pedantic -O3 -msse -msse3 -mpopcnt -msse4.1 -mssse3 -msse2 -flto -flto=jobserver
OpenBenchmarking.org: libavif avifenc 0.9.0, Encoder Speed: 2 (Seconds, Fewer Is Better). EPYC 7742 2P: 32.72 (SE +/- 0.28, N = 8); 2 x AMD EPYC 7742 64-Core: 32.70 (SE +/- 0.22, N = 15). 1. (CXX) g++ options: -O3 -fPIC -lm
OpenBenchmarking.org: libavif avifenc 0.9.0, Encoder Speed: 6 (Seconds, Fewer Is Better). EPYC 7742 2P: 12.10 (SE +/- 0.15, N = 3); 2 x AMD EPYC 7742 64-Core: 12.96 (SE +/- 0.26, N = 15). 1. (CXX) g++ options: -O3 -fPIC -lm
OpenBenchmarking.org: libavif avifenc 0.9.0, Encoder Speed: 10 (Seconds, Fewer Is Better). EPYC 7742 2P: 4.228 (SE +/- 0.028, N = 3); 2 x AMD EPYC 7742 64-Core: 4.189 (SE +/- 0.048, N = 3). 1. (CXX) g++ options: -O3 -fPIC -lm
OpenBenchmarking.org: libavif avifenc 0.9.0, Encoder Speed: 6, Lossless (Seconds, Fewer Is Better). EPYC 7742 2P: 34.93 (SE +/- 0.07, N = 3); 2 x AMD EPYC 7742 64-Core: 35.43 (SE +/- 0.42, N = 3). 1. (CXX) g++ options: -O3 -fPIC -lm
OpenBenchmarking.org: libavif avifenc 0.9.0, Encoder Speed: 10, Lossless (Seconds, Fewer Is Better). EPYC 7742 2P: 7.587 (SE +/- 0.073, N = 15); 2 x AMD EPYC 7742 64-Core: 7.400 (SE +/- 0.037, N = 3). 1. (CXX) g++ options: -O3 -fPIC -lm
C-Ray This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: C-Ray 1.1, Total Time - 4K, 16 Rays Per Pixel (Seconds, Fewer Is Better). EPYC 7742 2P: 7.754 (SE +/- 0.165, N = 12). 1. (CC) gcc options: -lm -lpthread -O3
POV-Ray This is a test of POV-Ray, the Persistence of Vision Raytracer. POV-Ray is used to create 3D graphics using ray-tracing. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: POV-Ray 3.7.0.7, Trace Time (Seconds, Fewer Is Better). EPYC 7742 2P: 8.028 (SE +/- 0.028, N = 3). 1. (CXX) g++ options: -pipe -O3 -ffast-math -march=native -pthread -lSDL -lXpm -lSM -lICE -lX11 -lIlmImf -lImath -lHalf -lIex -lIexMath -lIlmThread -lpthread -ltiff -ljpeg -lpng -lz -lrt -lm -lboost_thread -lboost_system
Tungsten Renderer Tungsten is a C++ physically based renderer that makes use of Intel's Embree ray tracing library. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: Tungsten Renderer 0.2.2, Scene: Hair (Seconds, Fewer Is Better). EPYC 7742 2P: 5.61696 (SE +/- 0.05055, N = 15). 1. (CXX) g++ options: -std=c++0x -march=znver1 -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -mfma -mbmi2 -mno-avx -mno-avx2 -mno-xop -mno-fma4 -mno-avx512f -mno-avx512vl -mno-avx512pf -mno-avx512er -mno-avx512cd -mno-avx512dq -mno-avx512bw -mno-avx512ifma -mno-avx512vbmi -fstrict-aliasing -O3 -rdynamic -lIlmImf -lIlmThread -lImath -lHalf -lIex -lz -ljpeg -lpthread -ldl
OpenBenchmarking.org: Tungsten Renderer 0.2.2, Scene: Water Caustic (Seconds, Fewer Is Better). EPYC 7742 2P: 23.60 (SE +/- 0.32, N = 15). 1. (CXX) g++ options: -std=c++0x -march=znver1 -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -mfma -mbmi2 -mno-avx -mno-avx2 -mno-xop -mno-fma4 -mno-avx512f -mno-avx512vl -mno-avx512pf -mno-avx512er -mno-avx512cd -mno-avx512dq -mno-avx512bw -mno-avx512ifma -mno-avx512vbmi -fstrict-aliasing -O3 -rdynamic -lIlmImf -lIlmThread -lImath -lHalf -lIex -lz -ljpeg -lpthread -ldl
OpenBenchmarking.org: Tungsten Renderer 0.2.2, Scene: Non-Exponential (Seconds, Fewer Is Better). EPYC 7742 2P: 1.72256 (SE +/- 0.03014, N = 15). 1. (CXX) g++ options: -std=c++0x -march=znver1 -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -mfma -mbmi2 -mno-avx -mno-avx2 -mno-xop -mno-fma4 -mno-avx512f -mno-avx512vl -mno-avx512pf -mno-avx512er -mno-avx512cd -mno-avx512dq -mno-avx512bw -mno-avx512ifma -mno-avx512vbmi -fstrict-aliasing -O3 -rdynamic -lIlmImf -lIlmThread -lImath -lHalf -lIex -lz -ljpeg -lpthread -ldl
OpenBenchmarking.org: Tungsten Renderer 0.2.2, Scene: Volumetric Caustic (Seconds, Fewer Is Better). EPYC 7742 2P: 4.46136 (SE +/- 0.00391, N = 3). 1. (CXX) g++ options: -std=c++0x -march=znver1 -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -mfma -mbmi2 -mno-avx -mno-avx2 -mno-xop -mno-fma4 -mno-avx512f -mno-avx512vl -mno-avx512pf -mno-avx512er -mno-avx512cd -mno-avx512dq -mno-avx512bw -mno-avx512ifma -mno-avx512vbmi -fstrict-aliasing -O3 -rdynamic -lIlmImf -lIlmThread -lImath -lHalf -lIex -lz -ljpeg -lpthread -ldl
Timed Wasmer Compilation This test times how long it takes to compile Wasmer. Wasmer is written in the Rust programming language and is a WebAssembly runtime implementation that supports WASI and Emscripten. This test profile builds Wasmer with the Cranelift and Singlepass compiler features enabled. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: Timed Wasmer Compilation 1.0.2, Time To Compile (Seconds, Fewer Is Better). EPYC 7742 2P: 68.22 (SE +/- 0.38, N = 3); 2 x AMD EPYC 7742 64-Core: 68.43 (SE +/- 0.18, N = 3). 1. (CC) gcc options: -m64 -pie -nodefaultlibs -ldl -lrt -lpthread -lgcc_s -lc -lm -lutil
Opus Codec Encoding Opus is an open audio codec. Opus is a lossy audio compression format designed primarily for interactive real-time applications over the Internet. This test uses Opus-Tools and measures the time required to encode a WAV file to Opus. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: Opus Codec Encoding 1.3.1, WAV To Opus Encode (Seconds, Fewer Is Better). EPYC 7742 2P: 9.150 (SE +/- 0.019, N = 5). 1. (CXX) g++ options: -fvisibility=hidden -logg -lm
Ngspice Ngspice is an open-source SPICE circuit simulator. Ngspice was originally based on the Berkeley SPICE electronic circuit simulator. Ngspice supports basic threading using OpenMP. This test profile is making use of the ISCAS 85 benchmark circuits. Learn more via the OpenBenchmarking.org test page.
Ngspice 34 (Seconds, Fewer Is Better)
Circuit: C2670 - EPYC 7742 2P: 169.46 (SE +/- 0.82, N = 3)
Circuit: C7552 - EPYC 7742 2P: 130.44 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -O0 -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lSM -lICE
RNNoise RNNoise is a recurrent neural network for audio noise reduction developed by Mozilla and Xiph.Org. This test profile is a single-threaded test measuring the time to denoise a sample 26 minute long 16-bit RAW audio file using this recurrent neural network noise suppression library. Learn more via the OpenBenchmarking.org test page.
RNNoise 2020-06-28 (Seconds, Fewer Is Better)
EPYC 7742 2P: 23.14 (SE +/- 0.04, N = 3)
1. (CC) gcc options: -O2 -pedantic -fvisibility=hidden
WebP2 Image Encode This is a test of Google's libwebp2 library with the WebP2 image encode utility and using a sample 6000x4000 pixel JPEG image as the input, similar to the WebP/libwebp test profile. WebP2 is currently experimental and under heavy development as ultimately the successor to WebP. WebP2 supports 10-bit HDR, more efficient lossy compression, improved lossless compression, animation support, and full multi-threading support compared to WebP. Learn more via the OpenBenchmarking.org test page.
WebP2 Image Encode 20210126 (Seconds, Fewer Is Better)
Encode Settings: Default - EPYC 7742 2P: 3.272 (SE +/- 0.035, N = 15)
Encode Settings: Quality 75, Compression Effort 7 - EPYC 7742 2P: 136.41 (SE +/- 0.15, N = 3)
Encode Settings: Quality 95, Compression Effort 7 - EPYC 7742 2P: 251.43 (SE +/- 0.06, N = 3)
Encode Settings: Quality 100, Compression Effort 5 - EPYC 7742 2P: 7.721 (SE +/- 0.027, N = 3)
Encode Settings: Quality 100, Lossless Compression - EPYC 7742 2P: 440.93 (SE +/- 0.44, N = 3)
1. (CXX) g++ options: -msse4.2 -fno-rtti -O3 -rdynamic -lpthread -ljpeg -lgif
Google SynthMark SynthMark is a cross platform tool for benchmarking CPU performance under a variety of real-time audio workloads. It uses a polyphonic synthesizer model to provide standardized tests for latency, jitter and computational throughput. Learn more via the OpenBenchmarking.org test page.
Google SynthMark 20201109 - Test: VoiceMark_100 (Voices, More Is Better)
EPYC 7742 2P: 646.91 (SE +/- 0.61, N = 3)
1. (CXX) g++ options: -lm -lpthread -std=c++11 -Ofast
Liquid-DSP LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.
Liquid-DSP 2021.01.31 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
Threads: 1 - EPYC 7742 2P: 53594667 (SE +/- 22980.67, N = 3); 2P: 53579667 (SE +/- 10477.49, N = 3)
Threads: 2 - EPYC 7742 2P: 107143333 (SE +/- 61191.87, N = 3); 2P: 107166667 (SE +/- 18559.21, N = 3)
Threads: 4 - EPYC 7742 2P: 213280000 (SE +/- 80829.04, N = 3); 2P: 213286667 (SE +/- 107445.08, N = 3)
Threads: 8 - EPYC 7742 2P: 427203333 (SE +/- 101707.64, N = 3); 2P: 427276667 (SE +/- 127322.86, N = 3)
Threads: 16 - EPYC 7742 2P: 832276667 (SE +/- 1082240.47, N = 3); 2P: 831613333 (SE +/- 620358.32, N = 3)
Threads: 32 - EPYC 7742 2P: 1616566667 (SE +/- 1822391.59, N = 3); 2P: 1618000000 (SE +/- 1153256.26, N = 3)
Threads: 64 - EPYC 7742 2P: 2703933333 (SE +/- 8434123.81, N = 3); 2P: 2693766667 (SE +/- 16574813.56, N = 3)
Threads: 128 - EPYC 7742 2P: 3135600000 (SE +/- 13159153.97, N = 3); 2P: 3218138462 (SE +/- 79162966.07, N = 13)
Threads: 256 - EPYC 7742 2P: 5525100000 (SE +/- 29512765.60, N = 3); 2P: 5550733333 (SE +/- 16339556.64, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
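The Liquid-DSP results span 1 to 256 threads, so one natural way to read them is as a scaling curve. The sketch below computes speedup over the single-thread result and per-thread efficiency (speedup divided by thread count). It uses the first value listed in each chart, on the assumption that values appear in the same order as the run labels; the helper name `scaling` is illustrative only.

```python
# Liquid-DSP throughput in samples/s, keyed by thread count.
# Values are the first result listed in each chart (assumed to belong to
# the first run label, "EPYC 7742 2P").
throughput = {
    1: 53594667,
    2: 107143333,
    4: 213280000,
    8: 427203333,
    16: 832276667,
    32: 1616566667,
    64: 2703933333,
    128: 3135600000,
    256: 5525100000,
}


def scaling(results):
    """Map thread count -> (speedup vs. 1 thread, per-thread efficiency)."""
    base = results[1]
    table = {}
    for threads, rate in sorted(results.items()):
        speedup = rate / base
        table[threads] = (speedup, speedup / threads)
    return table


table = scaling(throughput)
for threads, (speedup, efficiency) in table.items():
    print(f"{threads:>3} threads: {speedup:7.2f}x speedup, {efficiency:6.1%} efficiency")
```

With these numbers, throughput scales almost linearly up to 32 threads, then flattens: 256 threads deliver roughly 103x the single-thread rate, or about 40% efficiency per thread on this 128-core / 256-thread system.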
FinanceBench FinanceBench is a collection of financial program benchmarks with support for benchmarking on the GPU via OpenCL and CPU benchmarking with OpenMP. The FinanceBench test cases are focused on Black-Scholes-Merton Process with Analytic European Option engine, QMC (Sobol) Monte-Carlo method (Equity Option Example), Bonds Fixed-rate bond with flat forward curve, and Repo Securities repurchase agreement. FinanceBench was originally written by the Cavazos Lab at University of Delaware. Learn more via the OpenBenchmarking.org test page.
FinanceBench 2016-07-25 - Benchmark: Repo OpenMP (ms, Fewer Is Better)
EPYC 7742 2P: 52054.59 (SE +/- 187.65, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp
ASKAP ASKAP is a set of benchmarks from the Australian SKA Pathfinder. The principal ASKAP benchmarks are the Hogbom Clean Benchmark (tHogbomClean) and the Convolutional Resampling Benchmark (tConvolve); some earlier ASKAP benchmarks are also included for OpenCL and CUDA execution of tConvolve. Learn more via the OpenBenchmarking.org test page.
ASKAP 1.0 (More Is Better)
Test: tConvolve MT - Gridding - EPYC 7742 2P: 5224.98 Million Grid Points Per Second (SE +/- 48.52, N = 9)
Test: tConvolve MT - Degridding - EPYC 7742 2P: 7117.00 Million Grid Points Per Second (SE +/- 208.57, N = 9)
Test: tConvolve MPI - Degridding - EPYC 7742 2P: 38291.8 Mpix/sec (SE +/- 416.26, N = 3)
Test: tConvolve MPI - Gridding - EPYC 7742 2P: 37599.7 Mpix/sec (SE +/- 223.13, N = 3)
Test: tConvolve OpenMP - Gridding - EPYC 7742 2P: 4826.83 Million Grid Points Per Second (SE +/- 78.34, N = 12)
Test: tConvolve OpenMP - Degridding - EPYC 7742 2P: 3991.99 Million Grid Points Per Second (SE +/- 55.12, N = 12)
Test: Hogbom Clean OpenMP - EPYC 7742 2P: 217.82 Iterations Per Second (SE +/- 2.90, N = 15)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
LuaRadio LuaRadio is a lightweight software-defined radio (SDR) framework built atop LuaJIT. LuaRadio provides a suite of source, sink, and processing blocks, with a simple API for defining flow graphs, running flow graphs, creating blocks, and creating data types. Learn more via the OpenBenchmarking.org test page.
LuaRadio 0.9.1 (MiB/s, More Is Better)
Test: Five Back to Back FIR Filters - 2P: 653.0 (SE +/- 7.57, N = 4); EPYC 7742 2P: 643.9 (SE +/- 4.40, N = 3); 7742 2P Repeat: 643.8 (SE +/- 3.58, N = 3)
Test: FM Deemphasis Filter - 2P: 343.0 (SE +/- 2.83, N = 4); EPYC 7742 2P: 346.8 (SE +/- 0.15, N = 3); 7742 2P Repeat: 347.1 (SE +/- 0.12, N = 3)
Test: Hilbert Transform - 2P: 84.4 (SE +/- 0.11, N = 4); EPYC 7742 2P: 84.4 (SE +/- 0.00, N = 3); 7742 2P Repeat: 84.6 (SE +/- 0.03, N = 3)
Test: Complex Phase - 2P: 532.5 (SE +/- 0.72, N = 4); EPYC 7742 2P: 532.7 (SE +/- 0.59, N = 3); 7742 2P Repeat: 534.5 (SE +/- 0.76, N = 3)
GNU Radio GNU Radio is a free software development toolkit providing signal processing blocks to implement software-defined radios (SDR) and signal processing systems. Learn more via the OpenBenchmarking.org test page.
GNU Radio (MiB/s, More Is Better)
Test: Five Back to Back FIR Filters - 2P: 433.2 (SE +/- 12.19, N = 9); EPYC 7742 2P: 400.8 (SE +/- 11.37, N = 9); 7742 2P Repeat: 423.3 (SE +/- 4.29, N = 5)
Test: Signal Source (Cosine) - 2P: 3040.5 (SE +/- 19.01, N = 9); EPYC 7742 2P: 3032.8 (SE +/- 23.25, N = 9); 7742 2P Repeat: 3090.2 (SE +/- 29.08, N = 5)
Test: FIR Filter - 2P: 555.8 (SE +/- 1.19, N = 9); EPYC 7742 2P: 554.9 (SE +/- 0.84, N = 9); 7742 2P Repeat: 555.9 (SE +/- 0.41, N = 5)
Test: IIR Filter - 2P: 505.0 (SE +/- 1.12, N = 9); EPYC 7742 2P: 506.9 (SE +/- 0.97, N = 9); 7742 2P Repeat: 506.9 (SE +/- 0.31, N = 5)
Test: FM Deemphasis Filter - 2P: 751.2 (SE +/- 8.43, N = 9); EPYC 7742 2P: 744.6 (SE +/- 8.87, N = 9); 7742 2P Repeat: 747.2 (SE +/- 17.28, N = 5)
Test: Hilbert Transform - 2P: 436.2 (SE +/- 0.53, N = 9); EPYC 7742 2P: 436.5 (SE +/- 0.61, N = 9); 7742 2P Repeat: 437.6 (SE +/- 0.91, N = 5)
1. 3.8.1.0
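Several GNU Radio and LuaRadio charts compare repeated runs on the same hardware. One rough way to judge whether two runs differ by more than measurement noise is to express their gap in units of the combined standard error. The sketch below does that for the first two values listed in two of the GNU Radio charts. It assumes the runs are independent and that SE is the standard error of the mean; with so few samples it is only a coarse screen, not a significance test. The function name `difference_in_se_units` is hypothetical.

```python
import math


def difference_in_se_units(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two means divided by their combined SE.

    Assumes independent runs, so the standard errors add in quadrature.
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


# GNU Radio, Five Back to Back FIR Filters: 433.2 (SE 12.19) vs 400.8 (SE 11.37)
fir5 = difference_in_se_units(433.2, 12.19, 400.8, 11.37)
# GNU Radio, Hilbert Transform: 436.2 (SE 0.53) vs 436.5 (SE 0.61)
hilbert = difference_in_se_units(436.2, 0.53, 436.5, 0.61)

print(f"Five Back to Back FIR Filters: {fir5:.2f} combined SEs apart")
print(f"Hilbert Transform:             {hilbert:.2f} combined SEs apart")
```

By this measure the Hilbert Transform runs are well within noise of each other, while the Five Back to Back FIR Filters runs sit about two combined standard errors apart, which is consistent with that test's much larger reported SE.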
Blender Blender is an open-source 3D creation and modeling software project. This test is of Blender's Cycles benchmark with various sample files. GPU computing via OpenCL, NVIDIA OptiX, and NVIDIA CUDA is supported. Learn more via the OpenBenchmarking.org test page.
Blender 2.92 - Compute: CPU-Only (Seconds, Fewer Is Better)
Blend File: BMW27 - 2 x AMD EPYC 7742 64-Core: 24.22 (SE +/- 0.22, N = 3)
Blend File: Pabellon Barcelona - 2 x AMD EPYC 7742 64-Core: 64.56 (SE +/- 0.17, N = 3)
IOR IOR is a parallel I/O storage benchmark making use of MPI with a particular focus on HPC (High Performance Computing) systems. IOR is developed at the Lawrence Livermore National Laboratory (LLNL). Learn more via the OpenBenchmarking.org test page.
IOR 3.3.0 - Block Size: 32MB - Disk Target: Default Test Directory (MB/s, More Is Better)
7742 2P Repeat: 461.38 (SE +/- 2.10, N = 3; MIN: 406.14 / MAX: 1020.59)
1. (CC) gcc options: -O2 -lm -pthread -lmpi
Zstd Compression This test measures the time needed to compress/decompress a sample file (a FreeBSD disk image - FreeBSD-12.2-RELEASE-amd64-memstick.img) using Zstd compression with options for different compression levels / settings. Learn more via the OpenBenchmarking.org test page.
Zstd Compression 1.4.9 - Compression Level: 3 - Compression Speed (MB/s, More Is Better)
7742 2P Repeat: 5053.7 (SE +/- 62.74, N = 3)
1. (CC) gcc options: -O3 -pthread -lz -llzma
JPEG XL Decoding The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities, with JPEG XL offering better image quality and compression than legacy JPEG. This test profile measures JPEG XL decode performance to a PNG output file; the pts/jpegxl test covers encode performance. Learn more via the OpenBenchmarking.org test page.
JPEG XL Decoding 0.3.1 - CPU Threads: 1 (MP/s, More Is Better)
7742 2P Repeat: 32.97 (SE +/- 0.11, N = 3)
LuaJIT 2.1-git (Mflops, More Is Better)
Test: Fast Fourier Transform - 7742 2P Repeat: 210.57 (SE +/- 0.69, N = 3)
Test: Sparse Matrix Multiply - 7742 2P Repeat: 1008.92 (SE +/- 6.13, N = 3)
Test: Dense LU Matrix Factorization - 7742 2P Repeat: 2811.62 (SE +/- 173.31, N = 3)
Test: Jacobi Successive Over-Relaxation - 7742 2P Repeat: 1644.11 (SE +/- 0.26, N = 3)
1. (CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -U_FORTIFY_SOURCE -fno-stack-protector
EPYC 7742 2P
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Disk Notes: NONE / errors=remount-ro,relatime,rw / Block Size: 4096
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0x8301034
Java Notes: OpenJDK Runtime Environment (build 11.0.10+9-Ubuntu-0ubuntu1.20.04)
Python Notes: Python 2.7.18 + Python 3.8.5
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 9 March 2021 19:03 by user root.
2P
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0x8301034
Python Notes: Python 2.7.18 + Python 3.8.5
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 11 March 2021 05:52 by user root.
2 x AMD EPYC 7742 64-Core
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0x8301034
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 11 March 2021 07:50 by user root.
7742 2P Repeat Processor: 2 x AMD EPYC 7742 64-Core @ 2.25GHz (128 Cores / 256 Threads), Motherboard: Supermicro H11DSi-NT v2.00 (2.1 BIOS), Chipset: AMD Starship/Matisse, Memory: 16 x 8192 MB DDR4-3200MT/s HMA81GR7CJR8N-XN, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Monitor: VGA HDMI, Network: 2 x Intel 10G X550T
OS: Ubuntu 20.04, Kernel: 5.8.0-44-generic (x86_64), Display Server: X Server 1.20.8, Compiler: GCC 9.3.0, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Disk Notes: NONE / errors=remount-ro,relatime,rw / Block Size: 4096
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0x8301034
Java Notes: OpenJDK Runtime Environment (build 11.0.10+9-Ubuntu-0ubuntu1.20.04)
Python Notes: Python 2.7.18 + Python 3.8.5
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 11 March 2021 11:06 by user root.