2990wx-2021-amd

AMD Ryzen Threadripper 2990WX 32-Core testing with an ASUS ROG ZENITH EXTREME (1701 BIOS) and Gigabyte AMD Radeon RX 470/480/570/570X/580/580X/590 4GB on Ubuntu 20.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2101259-HA-2990WX20208&sro&grr .
System details (runs 1, 1a, 2, 3, 4, 5)
  Processor: AMD Ryzen Threadripper 2990WX 32-Core @ 3.00GHz (32 Cores / 64 Threads)
  Motherboard: ASUS ROG ZENITH EXTREME (1701 BIOS)
  Chipset: AMD 17h
  Memory: 32GB
  Disk: Samsung SSD 970 EVO 500GB + 250GB Western Digital WDS250G2X0C-00L350
  Graphics: Gigabyte AMD Radeon RX 470/480/570/570X/580/580X/590 4GB (1244/1750MHz)
  Audio: Realtek ALC1220
  Monitor: LG Ultra HD
  Network: Intel I211 + Qualcomm Atheros QCA6174 802.11ac + Wilocity Wil6200 802.11ad
  OS: Ubuntu 20.10
  Kernel: 5.8.0-34-generic (x86_64)
  Desktop: GNOME Shell 3.38.1
  Display Server: X Server 1.20.9
  Display Driver: modesetting 1.20.9
  OpenGL: 4.6 Mesa 20.2.1 (LLVM 11.0.0)
  Vulkan: 1.2.131
  Compiler: GCC 10.2.0
  File-System: ext4
  Screen Resolution: 1920x1080

Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Disk Details - 1, 2, 3, 4, 5: NONE / errors=remount-ro,relatime,rw / Block Size: 4096
Processor Details - Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0x800820d
Python Details - Python 3.8.6
Security Details - itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Kernel Details - 4, 5: Transparent Huge Pages: madvise
2990wx-2021-amd result overview ("-" marks a test not reported for that run)

Test | 1 | 1a | 2 | 3 | 4 | 5
openfoam: Motorbike 30M | 76.57 | - | 40.81 | - | - | -
openfoam: Motorbike 60M | 726.90 | - | - | - | - | -
qe: AUSURF112 | 1705.65 | - | 1710.86 | 1687.65 | 1677.48 | 1673.03
relion: Basic - CPU | 1823.688 | - | 1793.741 | 1798.427 | 1767.925 | 1781.073
lammps: 20k Atoms | 15.352 | - | 15.437 | 14.799 | 15.013 | 15.061
cp2k: Fayalite-FIST Data | 1474.831 | - | - | - | 1447.579 | -
mnn: inception-v3 | 48.514 | - | 48.290 | - | 47.444 | 48.246
mnn: mobilenet-v1-1.0 | 4.339 | - | 4.336 | - | 4.408 | 4.340
mnn: MobileNetV2_224 | 5.672 | - | 5.670 | - | 5.621 | 5.632
mnn: resnet-v2-50 | 38.329 | - | 37.812 | - | 37.935 | 37.835
mnn: SqueezeNetV1.0 | 9.459 | - | 9.208 | - | 9.044 | 9.126
ior: 32MB - Default Test Directory | 492.81 | - | 480.75 | 471.83 | 813.49 | 506.83
kripke | 26877810 | - | 27358503 | - | 25710838 | 25857623
cloverleaf: Lagrangian-Eulerian Hydrodynamics | 119.54 | - | 125.43 | 132.07 | 86.58 | 86.15
onnx: yolov4 - OpenMP CPU | 155 | - | 157 | - | 183 | 181
onnx: bertsquad-10 - OpenMP CPU | 122 | - | 128 | - | 213 | 216
onnx: shufflenet-v2-10 - OpenMP CPU | 5011 | - | 5487 | - | 6400 | 6498
npb: LU.C | - | 40721.51 | 39921.13 | - | 39758.79 | 41730.74
gcrypt | - | 216.058 | 217.833 | - | 216.246 | 216.768
npb: IS.D | - | 751.00 | 770.10 | - | 793.83 | 811.98
onnx: fcn-resnet101-11 - OpenMP CPU | 54 | - | 54 | - | 59 | 61
onnx: super-resolution-10 - OpenMP CPU | 2211 | - | 2223 | - | 2362 | 2341
ior: 16MB - Default Test Directory | 492.78 | - | 490.50 | 496.04 | 808.18 | 527.67
dav1d: Chimera 1080p 10-bit | 117.16 | - | 117.18 | 117.72 | 117.85 | 118.30
ior: 8MB - Default Test Directory | 524.14 | - | 535.92 | 529.36 | 594.69 | 536.25
qmcpack: simple-H2O | 36.305 | - | 37.346 | 37.455 | 35.900 | 35.624
npb: CG.C | - | 7685.88 | 7203.14 | - | 7217.86 | 7441.03
ior: 4MB - Default Test Directory | 623.83 | - | 592.02 | 582.93 | 687.41 | 654.19
build-godot: Time To Compile | 83.675 | - | 82.317 | - | 81.894 | 82.105
npb: EP.D | - | 1740.43 | 1740.55 | - | 1732.07 | 1737.33
amg | 401359967 | - | 391813467 | 396403133 | 398053633 | 386077433
financebench: Bonds OpenMP | - | 58307.904948 | 58878.983073 | - | 56415.078125 | 56829.992187
dav1d: Summer Nature 4K | 206.02 | - | 204.27 | 202.23 | 208.11 | 209.54
quantlib | - | 2320.7 | 2286.0 | - | 2288.6 | 2296.2
cpuminer-opt: LBC, LBRY Credits | - | 45567 | 45570 | - | 46943 | 45451
npb: FT.C | - | 21668.10 | 21182.51 | - | 22314.71 | 22032.23
rav1e: 5 | 1.002 | - | 0.993 | 0.995 | 1.014 | 1.008
cpuminer-opt: Deepcoin | - | 17059 | 16883 | - | 16920 | 16947
rav1e: 1 | 0.340 | - | 0.337 | 0.338 | 0.343 | 0.344
npb: MG.C | - | 17335.82 | 16587.15 | - | 17192.72 | 16629.67
financebench: Repo OpenMP | - | 42858.381510 | 42328.953125 | - | 41316.536458 | 41135.450521
rav1e: 6 | 1.322 | - | 1.307 | 1.308 | 1.329 | 1.320
lzbench: XZ 0 - Decompression | - | 108 | 109 | - | 109 | 108
lzbench: XZ 0 - Compression | - | 37 | 37 | - | 37 | 37
lzbench: Libdeflate 1 - Compression | - | 203 | 204 | - | 203 | 206
cpuminer-opt: Garlicoin | - | 4590.08 | 4639.97 | - | 4521.83 | 4520.30
cpuminer-opt: Magi | - | 1172.19 | 1149.83 | - | 1172.74 | 1171.44
etcpak: ETC2 | 154.297 | - | 154.896 | 154.923 | 155.118 | 154.303
rav1e: 10 | 2.957 | - | 2.885 | 2.911 | 2.924 | 2.908
cpuminer-opt: Triple SHA-256, Onecoin | - | 221130 | 218553 | - | 220147 | 220797
cpuminer-opt: Blake-2 S | - | 577410 | 571013 | - | 575833 | 565767
cpuminer-opt: Skeincoin | - | 137040 | 135643 | - | 131437 | 132140
cpuminer-opt: Quad SHA-256, Pyrite | - | 166940 | 165833 | - | 165943 | 166993
cpuminer-opt: Myriad-Groestl | - | 10123 | 10130 | - | 10210 | 10127
cpuminer-opt: Ringcoin | - | 2936.88 | 2940.54 | - | 2929.72 | 2950.68
cpuminer-opt: x25x | - | 819.47 | 839.46 | - | 830.66 | 830.89
synthmark: VoiceMark_100 | 594.561 | - | 593.545 | - | 587.965 | 589.778
lzbench: Crush 0 - Decompression | - | 481 | 481 | - | 480 | 481
lzbench: Crush 0 - Compression | - | 94 | 94 | - | 95 | 94
lzbench: Brotli 2 - Decompression | - | 662 | 664 | - | 666 | 640
lzbench: Brotli 2 - Compression | - | 196 | 196 | - | 197 | 196
cython-bench: N-Queens | - | 26.033 | 25.862 | - | 26.152 | 25.821
etcpak: ETC1 + Dithering | 227.802 | - | 230.993 | 229.770 | 230.942 | 229.607
lzbench: Brotli 0 - Decompression | - | 577 | 569 | - | 572 | 572
lzbench: Brotli 0 - Compression | - | 498 | 489 | - | 492 | 494
lzbench: Zstd 1 - Decompression | - | 1583 | 1572 | - | 1572 | 1533
lzbench: Zstd 1 - Compression | - | 526 | 526 | - | 525 | 525
lzbench: Zstd 8 - Decompression | - | 1765 | 1786 | - | 1787 | 1784
lzbench: Zstd 8 - Compression | - | 94 | 94 | - | 94 | 94
dav1d: Chimera 1080p | 533.91 | - | 540.87 | 542.26 | 540.92 | 539.62
tnn: CPU - MobileNet v2 | 281.081 | - | 279.843 | - | 289.279 | 288.426
etcpak: ETC1 | 247.713 | - | 248.786 | 247.557 | 248.611 | 248.733
tnn: CPU - SqueezeNet v1.1 | 251.689 | - | 251.540 | - | 251.417 | 251.339
lammps: Rhodopsin Protein | 13.008 | - | 12.766 | 12.475 | 12.589 | 12.832
ior: 2MB - Default Test Directory | 958.85 | - | 781.51 | 753.94 | 791.04 | 961.66
dav1d: Summer Nature 1080p | 555.39 | - | 548.65 | 552.84 | 554.60 | 552.65
npb: EP.C | - | 1733.78 | 1741.25 | - | 1750.21 | 1743.29
etcpak: DXT1 | 1599.218 | - | 1615.224 | 1607.366 | 1617.869 | 1619.436
OpenFOAM 8 - Input: Motorbike 30M (Seconds, Fewer Is Better)
  1: 76.57 (SE +/- 10.16, N = 8)
  2: 40.81 (SE +/- 18.67, N = 3)
  (CXX) g++ options: -std=c++11 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm
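Every result in this export carries a standard error (SE) and a trial count (N). As a rough sketch of how such a figure can be derived from individual trial times, the Python below computes the mean and the standard error of the mean. The trial values and the function name `standard_error` are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption rather than a confirmed detail of how the Phoronix Test Suite computes it.

```python
import math
import statistics


def standard_error(samples):
    """Return (mean, standard error of the mean) for a list of trial results."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two trials")
    mean = statistics.fmean(samples)
    # Assumption: sample standard deviation (n - 1 denominator).
    se = statistics.stdev(samples) / math.sqrt(n)
    return mean, se


# Hypothetical trial times in seconds (not taken from this result file).
trials = [40.0, 42.0, 44.0]
mean, se = standard_error(trials)
print(f"{mean:.2f} (SE +/- {se:.2f}, N = {len(trials)})")
```

A large SE relative to the mean, as in the OpenFOAM Motorbike 30M entry, indicates that the individual trials varied widely.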
OpenFOAM 8 - Input: Motorbike 60M (Seconds, Fewer Is Better)
  1: 726.90 (SE +/- 10.63, N = 9)
  (CXX) g++ options: -std=c++11 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm
Quantum ESPRESSO 6.7 - Input: AUSURF112 (Seconds, Fewer Is Better)
  1: 1705.65 (SE +/- 19.15, N = 7)
  2: 1710.86 (SE +/- 26.20, N = 3)
  3: 1687.65 (SE +/- 3.23, N = 3)
  4: 1677.48 (SE +/- 23.00, N = 4)
  5: 1673.03 (SE +/- 5.57, N = 3)
  (F9X) gfortran options: -lopenblas -lFoX_dom -lFoX_sax -lFoX_wxml -lFoX_common -lFoX_utils -lFoX_fsys -lfftw3 -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz
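When two runs differ by only a few percent, the reported SE values give a quick way to judge whether the gap might be run-to-run noise. The sketch below divides the difference between two means by their combined standard error, using the Quantum ESPRESSO run 1 and run 5 figures from this export. The function name `difference_in_se_units` is hypothetical, and the approach assumes the runs are independent and ignores the small trial counts, so treat it as a heuristic only.

```python
import math


def difference_in_se_units(mean_a, se_a, mean_b, se_b):
    """Return |mean_a - mean_b| divided by the combined standard error."""
    combined = math.sqrt(se_a ** 2 + se_b ** 2)
    if combined == 0:
        raise ValueError("combined standard error is zero")
    return abs(mean_a - mean_b) / combined


# Quantum ESPRESSO AUSURF112, run 1 vs run 5 (values from this export).
ratio = difference_in_se_units(1705.65, 19.15, 1673.03, 5.57)
print(f"difference = {ratio:.2f} combined standard errors")
```

A ratio well below about 2 suggests the two runs are not clearly distinguishable at these trial counts.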
RELION 3.1.1 - Test: Basic - Device: CPU (Seconds, Fewer Is Better)
  1: 1823.69 (SE +/- 15.01, N = 3)
  2: 1793.74 (SE +/- 2.64, N = 3)
  3: 1798.43 (SE +/- 5.61, N = 3)
  4: 1767.93 (SE +/- 9.74, N = 3)
  5: 1781.07 (SE +/- 5.46, N = 3)
  (CXX) g++ options: -fopenmp -std=c++0x -O3 -rdynamic -ldl -ltiff -lfftw3f -lfftw3 -lpng -pthread -lmpi_cxx -lmpi
LAMMPS Molecular Dynamics Simulator 29Oct2020 - Model: 20k Atoms (ns/day, More Is Better)
  1: 15.35 (SE +/- 0.03, N = 3)
  2: 15.44 (SE +/- 0.09, N = 3)
  3: 14.80 (SE +/- 0.05, N = 3)
  4: 15.01 (SE +/- 0.02, N = 3)
  5: 15.06 (SE +/- 0.16, N = 3)
  (CXX) g++ options: -O3 -pthread -lm
CP2K Molecular Dynamics 8.1 - Fayalite-FIST Data (Seconds, Fewer Is Better)
  1: 1474.83
  4: 1447.58
Mobile Neural Network 1.1.1 - Model: inception-v3 (ms, Fewer Is Better)
  1: 48.51 (SE +/- 0.32, N = 15; MIN: 44.81 / MAX: 103.24)
  2: 48.29 (SE +/- 0.35, N = 15; MIN: 44.15 / MAX: 138.3)
  4: 47.44 (SE +/- 0.12, N = 3; MIN: 45.7 / MAX: 117.61)
  5: 48.25 (SE +/- 0.45, N = 15; MIN: 43.95 / MAX: 109.39)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.1 - Model: mobilenet-v1-1.0 (ms, Fewer Is Better)
  1: 4.339 (SE +/- 0.050, N = 15; MIN: 3.45 / MAX: 35.08)
  2: 4.336 (SE +/- 0.067, N = 15; MIN: 3.43 / MAX: 36.21)
  4: 4.408 (SE +/- 0.099, N = 3; MIN: 3.88 / MAX: 23.42)
  5: 4.340 (SE +/- 0.036, N = 15; MIN: 3.82 / MAX: 35.09)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.1 - Model: MobileNetV2_224 (ms, Fewer Is Better)
  1: 5.672 (SE +/- 0.062, N = 15; MIN: 5.22 / MAX: 6.84)
  2: 5.670 (SE +/- 0.073, N = 15; MIN: 5.12 / MAX: 13.18)
  4: 5.621 (SE +/- 0.138, N = 3; MIN: 5.33 / MAX: 6.27)
  5: 5.632 (SE +/- 0.055, N = 15; MIN: 5.17 / MAX: 6.36)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.1 - Model: resnet-v2-50 (ms, Fewer Is Better)
  1: 38.33 (SE +/- 0.25, N = 15; MIN: 35.25 / MAX: 98.22)
  2: 37.81 (SE +/- 0.26, N = 15; MIN: 34.87 / MAX: 128.92)
  4: 37.94 (SE +/- 0.29, N = 3; MIN: 36.11 / MAX: 89.49)
  5: 37.84 (SE +/- 0.30, N = 15; MIN: 35 / MAX: 102.86)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.1 - Model: SqueezeNetV1.0 (ms, Fewer Is Better)
  1: 9.459 (SE +/- 0.119, N = 15; MIN: 8.23 / MAX: 19.92)
  2: 9.208 (SE +/- 0.104, N = 15; MIN: 8.24 / MAX: 22.52)
  4: 9.044 (SE +/- 0.149, N = 3; MIN: 8.26 / MAX: 12.92)
  5: 9.126 (SE +/- 0.086, N = 15; MIN: 8.22 / MAX: 19.08)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
IOR 3.3.0 - Block Size: 32MB - Disk Target: Default Test Directory (MB/s, More Is Better)
  1: 492.81 (SE +/- 4.96, N = 12; MIN: 391.29 / MAX: 1073.97)
  2: 480.75 (SE +/- 5.02, N = 7; MIN: 396.65 / MAX: 1002.46)
  3: 471.83 (SE +/- 4.61, N = 3; MIN: 411.76 / MAX: 928.83)
  4: 813.49 (SE +/- 6.78, N = 3; MIN: 299.5 / MAX: 1128.35)
  5: 506.83 (SE +/- 6.72, N = 5; MIN: 420 / MAX: 1027.38)
  (CC) gcc options: -O2 -lm -pthread -lmpi
Kripke 1.2.4 (Throughput FoM, More Is Better)
  1: 26877810 (SE +/- 46576.08, N = 3)
  2: 27358503 (SE +/- 496569.70, N = 12)
  4: 25710838 (SE +/- 432230.86, N = 12)
  5: 25857623 (SE +/- 621600.77, N = 12)
  (CXX) g++ options: -O3 -fopenmp
CloverLeaf - Lagrangian-Eulerian Hydrodynamics (Seconds, Fewer Is Better)
  1: 119.54 (SE +/- 0.72, N = 3)
  2: 125.43 (SE +/- 4.21, N = 9)
  3: 132.07 (SE +/- 3.21, N = 9)
  4: 86.58 (SE +/- 2.71, N = 12)
  5: 86.15 (SE +/- 1.40, N = 15)
  (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp
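Because some tests in this export are "Fewer Is Better" and others "More Is Better", comparing runs across tests requires normalizing for direction. The sketch below expresses one run relative to another as a speedup factor in both cases. The function name `speedup` and its `higher_is_better` flag are hypothetical; the inputs are the CloverLeaf (seconds) and ONNX Runtime bertsquad-10 (inferences per minute) values for runs 1 and 4 from this export.

```python
def speedup(baseline, candidate, higher_is_better):
    """Return how many times better `candidate` is than `baseline`.

    Values above 1.0 mean the candidate run performed better.
    """
    if baseline <= 0 or candidate <= 0:
        raise ValueError("results must be positive")
    if higher_is_better:
        return candidate / baseline
    return baseline / candidate


# CloverLeaf, Seconds, Fewer Is Better: run 1 vs run 4.
clover = speedup(119.54, 86.58, higher_is_better=False)
# ONNX Runtime bertsquad-10, Inferences Per Minute, More Is Better: run 1 vs run 4.
bert = speedup(122, 213, higher_is_better=True)
print(f"CloverLeaf: {clover:.2f}x, bertsquad-10: {bert:.2f}x")
```

Expressing both as a ratio above 1.0 keeps the two kinds of metric comparable, which matters if the ratios are later combined, for example in a geometric mean.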
ONNX Runtime 1.6 - Model: yolov4 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  1: 155 (SE +/- 1.78, N = 12)
  2: 157 (SE +/- 2.09, N = 4)
  4: 183 (SE +/- 2.10, N = 12)
  5: 181 (SE +/- 1.96, N = 3)
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
ONNX Runtime 1.6 - Model: bertsquad-10 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  1: 122 (SE +/- 2.08, N = 12)
  2: 128 (SE +/- 1.50, N = 3)
  4: 213 (SE +/- 1.92, N = 3)
  5: 216 (SE +/- 2.59, N = 12)
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
ONNX Runtime 1.6 - Model: shufflenet-v2-10 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  1: 5011 (SE +/- 167.82, N = 12)
  2: 5487 (SE +/- 55.38, N = 3)
  4: 6400 (SE +/- 69.07, N = 12)
  5: 6498 (SE +/- 65.22, N = 3)
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
NAS Parallel Benchmarks 3.4 - Test / Class: LU.C (Total Mop/s, More Is Better)
  1a: 40721.51 (SE +/- 705.58, N = 15)
  2: 39921.13 (SE +/- 824.34, N = 15)
  4: 39758.79 (SE +/- 904.51, N = 15)
  5: 41730.74 (SE +/- 555.68, N = 15)
  (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz
  Open MPI 4.0.3
Gcrypt Library 1.9 (Seconds, Fewer Is Better)
  1a: 216.06 (SE +/- 0.45, N = 3)
  2: 217.83 (SE +/- 2.05, N = 3)
  4: 216.25 (SE +/- 0.72, N = 3)
  5: 216.77 (SE +/- 1.12, N = 3)
  (CC) gcc options: -O2 -fvisibility=hidden
NAS Parallel Benchmarks 3.4 - Test / Class: IS.D (Total Mop/s, More Is Better)
  1a: 751.00 (SE +/- 17.71, N = 15)
  2: 770.10 (SE +/- 14.63, N = 12)
  4: 793.83 (SE +/- 8.96, N = 15)
  5: 811.98 (SE +/- 9.93, N = 3)
  (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz
  Open MPI 4.0.3
ONNX Runtime 1.6 - Model: fcn-resnet101-11 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  1: 54
  2: 54
  4: 59
  5: 61
  SE (listed for 3 of the 4 runs, in chart order): +/- 0.17, N = 3; +/- 0.17, N = 3; +/- 0.58, N = 3
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
ONNX Runtime 1.6 - Model: super-resolution-10 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  1: 2211 (SE +/- 4.69, N = 3)
  2: 2223 (SE +/- 8.92, N = 3)
  4: 2362 (SE +/- 7.60, N = 3)
  5: 2341 (SE +/- 22.29, N = 3)
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
IOR 3.3.0 - Block Size: 16MB - Disk Target: Default Test Directory (MB/s, More Is Better)
  1: 492.78 (SE +/- 3.19, N = 3; MIN: 391.54 / MAX: 1031.18)
  2: 490.50 (SE +/- 6.50, N = 4; MIN: 367.13 / MAX: 1024.21)
  3: 496.04 (SE +/- 1.33, N = 3; MIN: 422.66 / MAX: 988.54)
  4: 808.18 (SE +/- 3.29, N = 3; MIN: 297.46 / MAX: 1133.39)
  5: 527.67 (SE +/- 2.84, N = 3; MIN: 422.53 / MAX: 1035.04)
  (CC) gcc options: -O2 -lm -pthread -lmpi
dav1d 0.8.1 - Video Input: Chimera 1080p 10-bit (FPS, More Is Better)
  1: 117.16 (SE +/- 0.47, N = 3; MIN: 80.9 / MAX: 196.36)
  2: 117.18 (SE +/- 0.07, N = 3; MIN: 80.97 / MAX: 191.46)
  3: 117.72 (SE +/- 0.26, N = 3; MIN: 81.13 / MAX: 195.69)
  4: 117.85 (SE +/- 0.13, N = 3; MIN: 81.72 / MAX: 196.43)
  5: 118.30 (SE +/- 0.22, N = 3; MIN: 81.84 / MAX: 191.83)
  (CC) gcc options: -pthread
IOR 3.3.0 - Block Size: 8MB - Disk Target: Default Test Directory (MB/s, More Is Better)
  1: 524.14 (SE +/- 8.56, N = 3; MIN: 399.31 / MAX: 1031.12)
  2: 535.92 (SE +/- 8.80, N = 3; MIN: 424.77 / MAX: 1080.17)
  3: 529.36 (SE +/- 6.43, N = 6; MIN: 393.92 / MAX: 1033.73)
  4: 594.69 (SE +/- 7.50, N = 3; MIN: 330.92 / MAX: 1124.59)
  5: 536.25 (SE +/- 4.89, N = 15; MIN: 376.61 / MAX: 1090.13)
  (CC) gcc options: -O2 -lm -pthread -lmpi
QMCPACK 3.10 - Input: simple-H2O (Total Execution Time - Seconds, Fewer Is Better)
  1: 36.31 (SE +/- 0.57, N = 3)
  2: 37.35 (SE +/- 0.53, N = 3)
  3: 37.46 (SE +/- 0.49, N = 15)
  4: 35.90 (SE +/- 0.44, N = 15)
  5: 35.62 (SE +/- 0.39, N = 7)
  (CXX) g++ options: -fopenmp -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -march=native -O3 -fomit-frame-pointer -ffast-math -pthread -lm
NAS Parallel Benchmarks 3.4 - Test / Class: CG.C (Total Mop/s, More Is Better)
  1a: 7685.88 (SE +/- 214.27, N = 15)
  2: 7203.14 (SE +/- 230.46, N = 15)
  4: 7217.86 (SE +/- 203.10, N = 15)
  5: 7441.03 (SE +/- 198.36, N = 15)
  (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz
  Open MPI 4.0.3
IOR 3.3.0 - Block Size: 4MB - Disk Target: Default Test Directory (MB/s, More Is Better)
  1: 623.83 (SE +/- 37.33, N = 15; MIN: 310.99 / MAX: 1067.3)
  2: 592.02 (SE +/- 6.62, N = 12; MIN: 332.2 / MAX: 1089.17)
  3: 582.93 (SE +/- 4.85, N = 15; MIN: 366.23 / MAX: 1049.9)
  4: 687.41 (SE +/- 3.96, N = 3; MIN: 279.64 / MAX: 1117.59)
  5: 654.19 (SE +/- 35.87, N = 15; MIN: 328.97 / MAX: 1093.06)
  (CC) gcc options: -O2 -lm -pthread -lmpi
Timed Godot Game Engine Compilation 3.2.3 - Time To Compile (Seconds, Fewer Is Better)
  1: 83.68 (SE +/- 0.87, N = 3)
  2: 82.32 (SE +/- 0.43, N = 2)
  4: 81.89 (SE +/- 0.19, N = 3)
  5: 82.11 (SE +/- 0.38, N = 3)
NAS Parallel Benchmarks 3.4 - Test / Class: EP.D (Total Mop/s, More Is Better)
  1a: 1740.43 (SE +/- 0.38, N = 3)
  2: 1740.55 (SE +/- 0.69, N = 3)
  4: 1732.07 (SE +/- 2.98, N = 3)
  5: 1737.33 (SE +/- 2.63, N = 3)
  (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz
  Open MPI 4.0.3
Algebraic Multi-Grid Benchmark 1.2 (Figure Of Merit, More Is Better)
  1: 401359967 (SE +/- 4525834.83, N = 3)
  2: 391813467 (SE +/- 1988092.57, N = 3)
  3: 396403133 (SE +/- 2489918.59, N = 3)
  4: 398053633 (SE +/- 277098.08, N = 3)
  5: 386077433 (SE +/- 3176655.21, N = 3)
  (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -pthread -lmpi
FinanceBench 2016-07-25 - Benchmark: Bonds OpenMP (ms, Fewer Is Better)
  1a: 58307.90 (SE +/- 196.31, N = 3)
  2: 58878.98 (SE +/- 203.39, N = 3)
  4: 56415.08 (SE +/- 106.37, N = 3)
  5: 56829.99 (SE +/- 43.63, N = 3)
  (CXX) g++ options: -O3 -march=native -fopenmp
dav1d 0.8.1 - Video Input: Summer Nature 4K (FPS, More Is Better)
  1: 206.02 (SE +/- 2.02, N = 15; MIN: 129.35 / MAX: 225.85)
  2: 204.27 (SE +/- 2.03, N = 15; MIN: 127.4 / MAX: 222.46)
  3: 202.23 (SE +/- 1.90, N = 15; MIN: 126.91 / MAX: 221.77)
  4: 208.11 (SE +/- 2.27, N = 7; MIN: 133.38 / MAX: 225.17)
  5: 209.54 (SE +/- 0.49, N = 3; MIN: 140.28 / MAX: 222.38)
  (CC) gcc options: -pthread
QuantLib 1.21 (MFLOPS, More Is Better)
  1a: 2320.7 (SE +/- 4.31, N = 3)
  2: 2286.0 (SE +/- 23.91, N = 8)
  4: 2288.6 (SE +/- 27.83, N = 6)
  5: 2296.2 (SE +/- 27.47, N = 6)
  (CXX) g++ options: -O3 -march=native -rdynamic
Cpuminer-Opt 3.15.5 - Algorithm: LBC, LBRY Credits (kH/s, More Is Better)
  1a: 45567 (SE +/- 668.84, N = 3)
  2: 45570 (SE +/- 502.13, N = 3)
  4: 46943 (SE +/- 536.95, N = 3)
  5: 45451 (SE +/- 381.65, N = 15)
  (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
NAS Parallel Benchmarks 3.4 - Test / Class: FT.C (Total Mop/s, More Is Better)
  1a: 21668.10 (SE +/- 397.10, N = 15)
  2: 21182.51 (SE +/- 443.87, N = 15)
  4: 22314.71 (SE +/- 137.92, N = 3)
  5: 22032.23 (SE +/- 47.70, N = 3)
  (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz
  Open MPI 4.0.3
rav1e 0.4 - Speed: 5 (Frames Per Second, More Is Better)
  1: 1.002 (SE +/- 0.004, N = 3)
  2: 0.993 (SE +/- 0.003, N = 3)
  3: 0.995 (SE +/- 0.002, N = 3)
  4: 1.014 (SE +/- 0.002, N = 3)
  5: 1.008 (SE +/- 0.003, N = 3)
Cpuminer-Opt 3.15.5 - Algorithm: Deepcoin (kH/s, More Is Better)
  1a: 17059 (SE +/- 119.53, N = 14)
  2: 16883 (SE +/- 27.28, N = 3)
  4: 16920 (SE +/- 11.55, N = 3)
  5: 16947 (SE +/- 49.78, N = 3)
  (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
rav1e 0.4 - Speed: 1 (Frames Per Second, More Is Better)
  1: 0.340 (SE +/- 0.001, N = 3)
  2: 0.337 (SE +/- 0.001, N = 3)
  3: 0.338 (SE +/- 0.002, N = 3)
  4: 0.343 (SE +/- 0.003, N = 3)
  5: 0.344 (SE +/- 0.001, N = 3)
NAS Parallel Benchmarks 3.4 - Test / Class: MG.C (Total Mop/s, More Is Better)
  1a: 17335.82 (SE +/- 398.98, N = 15)
  2: 16587.15 (SE +/- 500.68, N = 15)
  4: 17192.72 (SE +/- 312.61, N = 15)
  5: 16629.67 (SE +/- 531.61, N = 15)
  (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz
  Open MPI 4.0.3
FinanceBench 2016-07-25 - Benchmark: Repo OpenMP (ms, Fewer Is Better)
  1a: 42858.38 (SE +/- 192.21, N = 3)
  2: 42328.95 (SE +/- 566.70, N = 3)
  4: 41316.54 (SE +/- 57.95, N = 3)
  5: 41135.45 (SE +/- 260.65, N = 3)
  (CXX) g++ options: -O3 -march=native -fopenmp
rav1e 0.4 - Speed: 6 (Frames Per Second, More Is Better)
  1: 1.322 (SE +/- 0.003, N = 3)
  2: 1.307 (SE +/- 0.006, N = 3)
  3: 1.308 (SE +/- 0.004, N = 3)
  4: 1.329 (SE +/- 0.005, N = 3)
  5: 1.320 (SE +/- 0.003, N = 3)
lzbench 1.8 - Test: XZ 0 - Process: Decompression (MB/s, More Is Better)
  1a: 108
  2: 109
  4: 109
  5: 108
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: XZ 0 - Process: Compression (MB/s, More Is Better)
  1a: 37
  2: 37
  4: 37
  5: 37
  SE (listed for 1 of the 4 runs): +/- 0.33, N = 3
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Libdeflate 1 - Process: Compression (MB/s, More Is Better)
  1a: 203 (SE +/- 3.06, N = 3)
  2: 204 (SE +/- 3.06, N = 3)
  4: 203 (SE +/- 2.87, N = 4)
  5: 206 (SE +/- 2.49, N = 15)
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
Cpuminer-Opt 3.15.5 - Algorithm: Garlicoin (kH/s, More Is Better)
  1a: 4590.08 (SE +/- 68.57, N = 4)
  2: 4639.97 (SE +/- 65.81, N = 3)
  4: 4521.83 (SE +/- 2.46, N = 3)
  5: 4520.30 (SE +/- 0.83, N = 3)
  (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.15.5 - Algorithm: Magi (kH/s, More Is Better)
  1a: 1172.19 (SE +/- 0.84, N = 3)
  2: 1149.83 (SE +/- 14.98, N = 4)
  4: 1172.74 (SE +/- 1.16, N = 3)
  5: 1171.44 (SE +/- 3.50, N = 3)
  (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Etcpak 0.7 - Configuration: ETC2 (Mpx/s, More Is Better)
  1: 154.30 (SE +/- 0.63, N = 3)
  2: 154.90 (SE +/- 0.02, N = 3)
  3: 154.92 (SE +/- 0.02, N = 3)
  4: 155.12 (SE +/- 0.01, N = 3)
  5: 154.30 (SE +/- 0.01, N = 3)
  (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
rav1e 0.4 - Speed: 10 (Frames Per Second, More Is Better)
  1: 2.957 (SE +/- 0.011, N = 3)
  2: 2.885 (SE +/- 0.009, N = 3)
  3: 2.911 (SE +/- 0.005, N = 3)
  4: 2.924 (SE +/- 0.010, N = 3)
  5: 2.908 (SE +/- 0.006, N = 3)
Cpuminer-Opt 3.15.5 - Algorithm: Triple SHA-256, Onecoin (kH/s, More Is Better)
  1a: 221130 (SE +/- 469.18, N = 3)
  2: 218553 (SE +/- 612.27, N = 3)
  4: 220147 (SE +/- 568.46, N = 3)
  5: 220797 (SE +/- 877.54, N = 3)
  (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.15.5 - Algorithm: Blake-2 S (kH/s, More Is Better)
  1a: 577410 (SE +/- 3940.00, N = 3)
  2: 571013 (SE +/- 3628.00, N = 3)
  4: 575833 (SE +/- 3934.48, N = 3)
  5: 565767 (SE +/- 5261.67, N = 3)
  (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.15.5 - Algorithm: Skeincoin (kH/s, More Is Better)
  1a: 137040 (SE +/- 918.82, N = 3)
  2: 135643 (SE +/- 1055.21, N = 3)
  4: 131437 (SE +/- 1707.45, N = 3)
  5: 132140 (SE +/- 441.63, N = 3)
  (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.15.5 - Algorithm: Quad SHA-256, Pyrite (kH/s, More Is Better)
  1a: 166940 (SE +/- 222.71, N = 3)
  2: 165833 (SE +/- 1021.96, N = 3)
  4: 165943 (SE +/- 1373.44, N = 3)
  5: 166993 (SE +/- 1729.38, N = 3)
  (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.15.5 - Algorithm: Myriad-Groestl (kH/s, More Is Better)
  1a: 10123 (SE +/- 8.82, N = 3)
  2: 10130 (SE +/- 17.32, N = 3)
  4: 10210 (SE +/- 66.58, N = 3)
  5: 10127 (SE +/- 12.02, N = 3)
  (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.15.5 - Algorithm: Ringcoin (kH/s, More Is Better)
  1a: 2936.88 (SE +/- 6.53, N = 3)
  2: 2940.54 (SE +/- 8.81, N = 3)
  4: 2929.72 (SE +/- 8.05, N = 3)
  5: 2950.68 (SE +/- 11.03, N = 3)
  (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.15.5 - Algorithm: x25x (kH/s, More Is Better)
  1a: 819.47 (SE +/- 7.85, N = 3)
  2: 839.46 (SE +/- 5.74, N = 3)
  4: 830.66 (SE +/- 1.18, N = 3)
  5: 830.89 (SE +/- 1.08, N = 3)
  (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Google SynthMark 20201109 - Test: VoiceMark_100 (Voices, More Is Better)
  1: 594.56 (SE +/- 1.64, N = 3)
  2: 593.55 (SE +/- 1.23, N = 3)
  4: 587.97 (SE +/- 1.86, N = 3)
  5: 589.78 (SE +/- 1.79, N = 3)
  (CXX) g++ options: -lm -lpthread -std=c++11 -Ofast
lzbench 1.8 - Test: Crush 0 - Process: Decompression (MB/s, More Is Better)
  1a: 481
  2: 481
  4: 480
  5: 481
  SE (listed for 2 of the 4 runs, in chart order): +/- 0.33, N = 3; +/- 0.58, N = 3
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Crush 0 - Process: Compression (MB/s, More Is Better)
  1a: 94
  2: 94
  4: 95
  5: 94
  SE (listed for 3 of the 4 runs, in chart order): +/- 0.33, N = 3; +/- 0.88, N = 3; +/- 0.67, N = 3
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Brotli 2 - Process: Decompression (MB/s, More Is Better)
  1a: 662 (SE +/- 4.26, N = 3)
  2: 664 (SE +/- 1.20, N = 3)
  4: 666 (SE +/- 0.67, N = 3)
  5: 640 (SE +/- 23.67, N = 3)
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Brotli 2 - Process: Compression (MB/s, More Is Better)
  1a: 196
  2: 196
  4: 197
  5: 196
  SE (listed for 1 of the 4 runs): +/- 0.33, N = 3
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
Cython Benchmark 0.29.21 - Test: N-Queens (Seconds, Fewer Is Better)
  1a: 26.03 (SE +/- 0.13, N = 3)
  2: 25.86 (SE +/- 0.25, N = 3)
  4: 26.15 (SE +/- 0.19, N = 3)
  5: 25.82 (SE +/- 0.08, N = 3)
Etcpak 0.7 - Configuration: ETC1 + Dithering (Mpx/s, More Is Better)
  1: 227.80 (SE +/- 2.59, N = 6)
  2: 230.99 (SE +/- 0.07, N = 3)
  3: 229.77 (SE +/- 1.28, N = 3)
  4: 230.94 (SE +/- 0.34, N = 3)
  5: 229.61 (SE +/- 0.32, N = 3)
  (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
lzbench 1.8 - Test: Brotli 0 - Process: Decompression (MB/s, More Is Better)
  1a: 577
  2: 569
  4: 572
  5: 572
  SE (listed for 2 of the 4 runs, in chart order): +/- 4.33, N = 3; +/- 2.00, N = 2
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Brotli 0 - Process: Compression (MB/s, More Is Better)
  1a: 498 (SE +/- 3.18, N = 3)
  2: 489 (SE +/- 3.00, N = 3)
  4: 492 (SE +/- 1.20, N = 3)
  5: 494 (SE +/- 1.15, N = 3)
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Zstd 1 - Process: Decompression (MB/s, More Is Better)
  1a: 1583 (SE +/- 1.15, N = 3)
  2: 1572 (SE +/- 0.33, N = 3)
  4: 1572 (SE +/- 2.65, N = 3)
  5: 1533 (SE +/- 40.67, N = 3)
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Zstd 1 - Process: Compression (MB/s, More Is Better)
  1a: 526 (SE +/- 1.15, N = 3)
  2: 526 (SE +/- 0.67, N = 3)
  4: 525 (SE +/- 0.33, N = 3)
  5: 525 (SE +/- 1.00, N = 3)
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Zstd 8 - Process: Decompression (MB/s, More Is Better)
  1a: 1765
  2: 1786
  4: 1787
  5: 1784
  SE (listed for 3 of the 4 runs, in chart order): +/- 0.33, N = 3; +/- 0.67, N = 3; +/- 2.33, N = 3
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Zstd 8 - Process: Compression (MB/s, More Is Better)
Run 1a: 94
Run 2: 94
Run 4: 94
Run 5: 94
1. (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3

dav1d 0.8.1 - Video Input: Chimera 1080p (FPS, More Is Better)
Run 1: 533.91 (SE +/- 8.93, N = 3; MIN: 411.82 / MAX: 669.31)
Run 2: 540.87 (SE +/- 1.71, N = 3; MIN: 423.28 / MAX: 669.12)
Run 3: 542.26 (SE +/- 1.05, N = 3; MIN: 422.86 / MAX: 672.41)
Run 4: 540.92 (SE +/- 2.60, N = 3; MIN: 423.15 / MAX: 672.98)
Run 5: 539.62 (SE +/- 2.78, N = 3; MIN: 421.29 / MAX: 668.77)
1. (CC) gcc options: -pthread

TNN 0.2.3 - Target: CPU - Model: MobileNet v2 (ms, Fewer Is Better)
Run 1: 281.08 (SE +/- 0.22, N = 3; MIN: 263.7 / MAX: 327.87)
Run 2: 279.84 (SE +/- 0.81, N = 3; MIN: 264.8 / MAX: 315.98)
Run 4: 289.28 (SE +/- 0.81, N = 3; MIN: 265.62 / MAX: 312.79)
Run 5: 288.43 (SE +/- 0.44, N = 3; MIN: 263.56 / MAX: 332.62)
1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl

Etcpak 0.7 - Configuration: ETC1 (Mpx/s, More Is Better)
Run 1: 247.71 (SE +/- 1.00, N = 3)
Run 2: 248.79 (SE +/- 0.06, N = 3)
Run 3: 247.56 (SE +/- 1.16, N = 3)
Run 4: 248.61 (SE +/- 0.57, N = 3)
Run 5: 248.73 (SE +/- 0.45, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread

TNN 0.2.3 - Target: CPU - Model: SqueezeNet v1.1 (ms, Fewer Is Better)
Run 1: 251.69 (SE +/- 0.03, N = 3; MIN: 250.86 / MAX: 254.16)
Run 2: 251.54 (SE +/- 0.15, N = 3; MIN: 250.62 / MAX: 254.15)
Run 4: 251.42 (SE +/- 0.31, N = 3; MIN: 250.45 / MAX: 260.46)
Run 5: 251.34 (SE +/- 0.07, N = 3; MIN: 250.59 / MAX: 254.17)
1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl

LAMMPS Molecular Dynamics Simulator 29Oct2020 - Model: Rhodopsin Protein (ns/day, More Is Better)
Run 1: 13.01 (SE +/- 0.22, N = 13)
Run 2: 12.77 (SE +/- 0.25, N = 15)
Run 3: 12.48 (SE +/- 0.16, N = 15)
Run 4: 12.59 (SE +/- 0.22, N = 15)
Run 5: 12.83 (SE +/- 0.22, N = 15)
1. (CXX) g++ options: -O3 -pthread -lm

IOR 3.3.0 - Block Size: 2MB - Disk Target: Default Test Directory (MB/s, More Is Better)
Run 1: 958.85 (SE +/- 3.76, N = 3; MIN: 837.5 / MAX: 1071.89)
Run 2: 781.51 (SE +/- 8.48, N = 3; MIN: 347.79 / MAX: 1068.81)
Run 3: 753.94 (SE +/- 10.17, N = 3; MIN: 291.39 / MAX: 1030.85)
Run 4: 791.04 (SE +/- 6.02, N = 3; MIN: 324.26 / MAX: 1057.74)
Run 5: 961.66 (SE +/- 3.23, N = 3; MIN: 792.79 / MAX: 1076.37)
1. (CC) gcc options: -O2 -lm -pthread -lmpi

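The IOR 2MB result above has the widest run-to-run gap in this section: runs 1 and 5 average close to 960 MB/s, while runs 2, 3 and 4 fall between about 754 and 791 MB/s. As a rough illustration only, the Python sketch below computes how far the best run's mean sits above the worst run's mean. The dictionary and function names are hypothetical helpers written for this example and are not part of the Phoronix Test Suite; the values are copied from the result above.

```python
# Mean IOR throughput in MB/s per run, copied from the IOR 3.3.0 2MB result.
# The names below are illustrative and not part of the Phoronix Test Suite.
ior_mb_per_s = {
    "1": 958.85,
    "2": 781.51,
    "3": 753.94,
    "4": 791.04,
    "5": 961.66,
}


def spread_percent(results):
    """Return how far the best mean exceeds the worst mean, in percent."""
    best = max(results.values())
    worst = min(results.values())
    return (best - worst) / worst * 100.0


def best_and_worst(results):
    """Return the run identifiers with the highest and lowest mean."""
    best_run = max(results, key=results.get)
    worst_run = min(results, key=results.get)
    return best_run, worst_run


if __name__ == "__main__":
    best_run, worst_run = best_and_worst(ior_mb_per_s)
    print(f"best run: {best_run}, worst run: {worst_run}")
    print(f"spread: {spread_percent(ior_mb_per_s):.2f}%")
```

The same helper can be pointed at any other result in this file by swapping in that test's per-run means; for "Fewer Is Better" tests such as TNN the best and worst roles are reversed.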
dav1d 0.8.1 - Video Input: Summer Nature 1080p (FPS, More Is Better)
Run 1: 555.39 (SE +/- 2.78, N = 3; MIN: 320.77 / MAX: 613.37)
Run 2: 548.65 (SE +/- 2.92, N = 3; MIN: 328.88 / MAX: 602.05)
Run 3: 552.84 (SE +/- 2.25, N = 3; MIN: 322.61 / MAX: 608.3)
Run 4: 554.60 (SE +/- 1.32, N = 3; MIN: 342.81 / MAX: 607.82)
Run 5: 552.65 (SE +/- 2.21, N = 3; MIN: 338.43 / MAX: 607.43)
1. (CC) gcc options: -pthread

NAS Parallel Benchmarks 3.4 - Test / Class: EP.C (Total Mop/s, More Is Better)
Run 1a: 1733.78 (SE +/- 7.09, N = 3)
Run 2: 1741.25 (SE +/- 0.68, N = 3)
Run 4: 1750.21 (SE +/- 6.67, N = 3)
Run 5: 1743.29 (SE +/- 4.65, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz
2. Open MPI 4.0.3

Etcpak 0.7 - Configuration: DXT1 (Mpx/s, More Is Better)
Run 1: 1599.22 (SE +/- 0.37, N = 3)
Run 2: 1615.22 (SE +/- 2.00, N = 3)
Run 3: 1607.37 (SE +/- 9.61, N = 3)
Run 4: 1617.87 (SE +/- 0.57, N = 3)
Run 5: 1619.44 (SE +/- 1.41, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread

Phoronix Test Suite v10.8.4