haswell-2021
Intel Core i3-4130 testing with an Intel DH87RL (RLH8710H.86A.0332.2018.1026.1448 BIOS) and Intel Haswell 2GB on Ubuntu 19.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2102182-HA-HASWELL2093&grw&rdt .
haswell-2021 system details (runs 1, 1a, 2, 3)
  Processor: Intel Core i3-4130 @ 3.40GHz (2 Cores / 4 Threads)
  Motherboard: Intel DH87RL (RLH8710H.86A.0332.2018.1026.1448 BIOS)
  Chipset: Intel 4th Gen Core DRAM
  Memory: 2048MB
  Disk: 128GB TOSHIBA THNSNH12
  Graphics: Intel Haswell 2GB (1150MHz)
  Audio: Intel Xeon E3-1200 v3/4th
  Monitor: DELL S2409W
  Network: Intel I217-V
  OS: Ubuntu 19.10
  Kernel: 5.5.0-rc2-patched (x86_64) 20200115
  Desktop: GNOME Shell 3.34.1
  Display Server: X Server 1.20.5
  OpenGL: 4.5 Mesa 19.2.8
  Vulkan: 1.1.102
  Compiler: GCC 9.2.1 20191008
  File-System: ext4
  Screen Resolution: 1920x1080
Kernel Details
  Transparent Huge Pages: madvise
Compiler Details
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-offload-targets=nvptx-none,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details
  Scaling Governor: intel_pstate powersave
  CPU Microcode: 0x28
  Thermald 1.9
Python Details
  Python 2.7.17 + Python 3.7.5
Security Details
  itlb_multihit: KVM: Mitigation of Split huge pages
  l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable
  mds: Mitigation of Clear buffers; SMT vulnerable
  meltdown: Mitigation of PTI
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling
  tsx_async_abort: Not affected
haswell-2021 result summary ("-" marks a test with no result for that run)
Test | 1 | 1a | 2 | 3
toybrot: OpenMP | 311687 | 311825 | 311719 | 311678
toybrot: C++ Tasks | 311235 | 311270 | 311310 | 311303
toybrot: C++ Threads | 311483 | 311462 | 311506 | 311510
clomp: Static OMP Speedup | 1.3 | 1.3 | 1.3 | 1.3
ngspice: C2670 | 377.899 | 371.843 | 371.911 | 374.349
ngspice: C7552 | 402.674 | 404.287 | 399.684 | 398.021
encode-ape: WAV To APE | 15.859 | 15.864 | 15.906 | 15.882
encode-opus: WAV To Opus Encode | 11.037 | 11.041 | 11.046 | 11.039
encode-wavpack: WAV To WavPack | - | 18.064 | 18.037 | 18.073
etcpak: DXT1 | 1042.598 | 1033.119 | 1048.037 | 1040.956
etcpak: ETC1 | 230.807 | 229.892 | 230.989 | 229.533
etcpak: ETC2 | 135.577 | 135.663 | 135.987 | 135.305
etcpak: ETC1 + Dithering | 216.238 | 219.343 | 219.439 | 217.796
jpegxl: PNG - 5 | 17.85 | 17.78 | 17.80 | 17.83
jpegxl: JPEG - 5 | 27.55 | 27.20 | 27.52 | 27.51
jpegxl: JPEG - 7 | 27.52 | 27.10 | 27.33 | 27.39
jpegxl: PNG - 7 | - | 1.57 | 1.57 | 1.60
jpegxl-decode: 1 | 21.50 | 21.27 | 21.48 | 21.44
jpegxl-decode: All | 37.59 | 37.39 | 37.74 | 37.53
encode-ogg: WAV To Ogg | 28.495 | 28.304 | 28.262 | 28.277
synthmark: VoiceMark_100 | 545.077 | 541.774 | 543.819 | 544.821
gcrypt: | 280.534 | 280.190 | 282.351 | 280.702
cloverleaf: Lagrangian-Eulerian Hydrodynamics | 554.66 | 548.58 | 544.00 | 553.07
mnn: SqueezeNetV1.0 | - | 23.312 | 21.819 | 22.147
mnn: resnet-v2-50 | - | 109.585 | 111.048 | 111.431
mnn: MobileNetV2_224 | - | 12.758 | 12.841 | 13.488
mnn: mobilenet-v1-1.0 | - | 14.591 | 14.981 | 14.816
mnn: inception-v3 | - | 172.913 | 172.572 | 176.035
onnx: yolov4 - OpenMP CPU | - | 87 | 87 | 86
onnx: bertsquad-10 - OpenMP CPU | - | 104 | 106 | 103
onnx: fcn-resnet101-11 - OpenMP CPU | - | 15 | 15 | 15
onnx: shufflenet-v2-10 - OpenMP CPU | - | 5785 | 5836 | 5797
onnx: super-resolution-10 - OpenMP CPU | - | 761 | 783 | 770
tnn: CPU - MobileNet v2 | - | 364.404 | 361.519 | 366.279
tnn: CPU - SqueezeNet v1.1 | - | 375.894 | 377.544 | 380.153
gromacs: water_GMX50_bare | - | 0.167 | 0.175 | 0.166
lammps: Rhodopsin Protein | 1.164 | 1.162 | 1.172 | 1.159
askap: tConvolve MT - Gridding | - | 374.146 | 367.930 | 368.791
askap: tConvolve MT - Degridding | - | 434.174 | 426.568 | 434.397
askap: tConvolve MPI - Degridding | - | 239.588 | 240.377 | 239.122
askap: tConvolve MPI - Gridding | - | 323.899 | 325.866 | 325.097
askap: tConvolve OpenMP - Gridding | - | 338.630 | 340.226 | 333.538
askap: tConvolve OpenMP - Degridding | - | 431.552 | 437.333 | 436.173
askap: Hogbom Clean OpenMP | - | 59.1427 | 59.7027 | 60.3150
lulesh: | 441.30157 | 519.95123 | 445.61644 | 461.70435
qmcpack: simple-H2O | 38.371 | 38.430 | 38.867 | 38.760
dav1d: Chimera 1080p | 119.69 | 121.64 | 120.37 | 120.89
dav1d: Summer Nature 4K | 30.04 | 30.19 | 30.09 | 30.17
dav1d: Summer Nature 1080p | 110.16 | 111.99 | 111.00 | 112.88
dav1d: Chimera 1080p 10-bit | 29.61 | 29.72 | 29.51 | 29.67
build-eigen: Time To Compile | 130.562 | 127.820 | 127.773 | 127.820
redis: LPOP | - | 1733646.58 | 1163471.21 | 1170246.42
redis: SADD | - | 1480929.42 | 1471883.58 | 1443346.96
redis: LPUSH | - | 1138733.25 | 1147588.38 | 1132436.75
redis: GET | - | 1696677.54 | 1553775.24 | 1579594.62
redis: SET | - | 1319523.38 | 1320791.37 | 1330886.96
gnupg: 2.7GB Sample File Encryption | - | 85.506 | 84.952 | 84.796
unpack-firefox: firefox-84.0.source.tar.xz | - | 27.031 | 27.702 | 27.517
quantlib: | 1579.6 | 1573.9 | 1579.7 | 1578.7
toyBrot Fractal Generator 2020-11-18 - Implementation: OpenMP (ms, Fewer Is Better)
  1: 311687 (SE +/- 15.70, N = 3)
  1a: 311825 (SE +/- 107.51, N = 3)
  2: 311719 (SE +/- 36.20, N = 3)
  3: 311678 (SE +/- 10.91, N = 3)
  (CXX) g++ options: -O3 -lpthread
toyBrot Fractal Generator 2020-11-18 - Implementation: C++ Tasks (ms, Fewer Is Better)
  1: 311235 (SE +/- 12.17, N = 3)
  1a: 311270 (SE +/- 22.28, N = 3)
  2: 311310 (SE +/- 50.05, N = 3)
  3: 311303 (SE +/- 17.90, N = 3)
  (CXX) g++ options: -O3 -lpthread
toyBrot Fractal Generator 2020-11-18 - Implementation: C++ Threads (ms, Fewer Is Better)
  1: 311483 (SE +/- 31.00, N = 3)
  1a: 311462 (SE +/- 8.29, N = 3)
  2: 311506 (SE +/- 16.09, N = 3)
  3: 311510 (SE +/- 24.04, N = 3)
  (CXX) g++ options: -O3 -lpthread
CLOMP 1.2 - Static OMP Speedup (Speedup, More Is Better)
  1: 1.3 (SE +/- 0.00, N = 3)
  1a: 1.3 (SE +/- 0.00, N = 3)
  2: 1.3 (SE +/- 0.00, N = 3)
  3: 1.3 (SE +/- 0.00, N = 3)
  (CC) gcc options: -fopenmp -O3 -lm
Ngspice 34 - Circuit: C2670 (Seconds, Fewer Is Better)
  1: 377.90 (SE +/- 2.49, N = 3)
  1a: 371.84 (SE +/- 0.96, N = 3)
  2: 371.91 (SE +/- 1.05, N = 3)
  3: 374.35 (SE +/- 3.95, N = 3)
  (CC) gcc options: -O0 -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lSM -lICE
Ngspice 34 - Circuit: C7552 (Seconds, Fewer Is Better)
  1: 402.67 (SE +/- 4.61, N = 3)
  1a: 404.29 (SE +/- 0.93, N = 3)
  2: 399.68 (SE +/- 0.54, N = 3)
  3: 398.02 (SE +/- 0.91, N = 3)
  (CC) gcc options: -O0 -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lSM -lICE
Monkey Audio Encoding 3.99.6 - WAV To APE (Seconds, Fewer Is Better)
  1: 15.86 (SE +/- 0.04, N = 5)
  1a: 15.86 (SE +/- 0.04, N = 5)
  2: 15.91 (SE +/- 0.04, N = 5)
  3: 15.88 (SE +/- 0.04, N = 5)
  (CXX) g++ options: -O3 -pedantic -rdynamic -lrt
Opus Codec Encoding 1.3.1 - WAV To Opus Encode (Seconds, Fewer Is Better)
  1: 11.04 (SE +/- 0.04, N = 5)
  1a: 11.04 (SE +/- 0.03, N = 5)
  2: 11.05 (SE +/- 0.04, N = 5)
  3: 11.04 (SE +/- 0.03, N = 5)
  (CXX) g++ options: -fvisibility=hidden -logg -lm
WavPack Audio Encoding 5.3 - WAV To WavPack (Seconds, Fewer Is Better)
  1a: 18.06 (SE +/- 0.04, N = 5)
  2: 18.04 (SE +/- 0.02, N = 5)
  3: 18.07 (SE +/- 0.04, N = 5)
  (CXX) g++ options: -rdynamic
Etcpak 0.7 - Configuration: DXT1 (Mpx/s, More Is Better)
  1: 1042.60 (SE +/- 4.61, N = 3)
  1a: 1033.12 (SE +/- 11.40, N = 3)
  2: 1048.04 (SE +/- 2.99, N = 3)
  3: 1040.96 (SE +/- 7.80, N = 3)
  (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
Etcpak 0.7 - Configuration: ETC1 (Mpx/s, More Is Better)
  1: 230.81 (SE +/- 0.07, N = 3)
  1a: 229.89 (SE +/- 0.97, N = 3)
  2: 230.99 (SE +/- 0.01, N = 3)
  3: 229.53 (SE +/- 0.81, N = 3)
  (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
Etcpak 0.7 - Configuration: ETC2 (Mpx/s, More Is Better)
  1: 135.58 (SE +/- 0.30, N = 3)
  1a: 135.66 (SE +/- 0.27, N = 3)
  2: 135.99 (SE +/- 0.01, N = 3)
  3: 135.31 (SE +/- 0.36, N = 3)
  (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
Etcpak 0.7 - Configuration: ETC1 + Dithering (Mpx/s, More Is Better)
  1: 216.24 (SE +/- 3.09, N = 12)
  1a: 219.34 (SE +/- 0.11, N = 3)
  2: 219.44 (SE +/- 0.03, N = 3)
  3: 217.80 (SE +/- 0.21, N = 3)
  (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
JPEG XL 0.3.1 - Input: PNG - Encode Speed: 5 (MP/s, More Is Better)
  1: 17.85 (SE +/- 0.07, N = 3)
  1a: 17.78 (SE +/- 0.06, N = 3)
  2: 17.80 (SE +/- 0.03, N = 3)
  3: 17.83 (SE +/- 0.08, N = 3)
  (CXX) g++ options: -funwind-tables -O3 -O2 -pthread -lbrotlicommon -lbrotlienc -lbrotlidec -ldl
JPEG XL 0.3.1 - Input: JPEG - Encode Speed: 5 (MP/s, More Is Better)
  1: 27.55 (SE +/- 0.20, N = 3)
  1a: 27.20 (SE +/- 0.15, N = 3)
  2: 27.52 (SE +/- 0.12, N = 3)
  3: 27.51 (SE +/- 0.02, N = 3)
  (CXX) g++ options: -funwind-tables -O3 -O2 -pthread -lbrotlicommon -lbrotlienc -lbrotlidec -ldl
JPEG XL 0.3.1 - Input: JPEG - Encode Speed: 7 (MP/s, More Is Better)
  1: 27.52 (SE +/- 0.03, N = 3)
  1a: 27.10 (SE +/- 0.10, N = 3)
  2: 27.33 (SE +/- 0.08, N = 3)
  3: 27.39 (SE +/- 0.13, N = 3)
  (CXX) g++ options: -funwind-tables -O3 -O2 -pthread -lbrotlicommon -lbrotlienc -lbrotlidec -ldl
JPEG XL 0.3.1 - Input: PNG - Encode Speed: 7 (MP/s, More Is Better)
  1a: 1.57 (SE +/- 0.01, N = 3)
  2: 1.57 (SE +/- 0.01, N = 3)
  3: 1.60 (SE +/- 0.02, N = 5)
  (CXX) g++ options: -funwind-tables -O3 -O2 -pthread -lbrotlicommon -lbrotlienc -lbrotlidec -ldl
JPEG XL Decoding 0.3.1 - CPU Threads: 1 (MP/s, More Is Better)
  1: 21.50 (SE +/- 0.06, N = 3)
  1a: 21.27 (SE +/- 0.02, N = 3)
  2: 21.48 (SE +/- 0.02, N = 3)
  3: 21.44 (SE +/- 0.10, N = 3)
JPEG XL Decoding 0.3.1 - CPU Threads: All (MP/s, More Is Better)
  1: 37.59 (SE +/- 0.10, N = 3)
  1a: 37.39 (SE +/- 0.22, N = 3)
  2: 37.74 (SE +/- 0.07, N = 3)
  3: 37.53 (SE +/- 0.07, N = 3)
Ogg Audio Encoding 1.3.4 - WAV To Ogg (Seconds, Fewer Is Better)
  1: 28.50 (SE +/- 0.20, N = 3)
  1a: 28.30 (SE +/- 0.05, N = 3)
  2: 28.26 (SE +/- 0.05, N = 3)
  3: 28.28 (SE +/- 0.03, N = 3)
  (CC) gcc options: -O2 -ffast-math -fsigned-char
Google SynthMark 20201109 - Test: VoiceMark_100 (Voices, More Is Better)
  1: 545.08 (SE +/- 1.77, N = 3)
  1a: 541.77 (SE +/- 1.34, N = 3)
  2: 543.82 (SE +/- 1.38, N = 3)
  3: 544.82 (SE +/- 0.74, N = 3)
  (CXX) g++ options: -lm -lpthread -std=c++11 -Ofast
Gcrypt Library 1.9 (Seconds, Fewer Is Better)
  1: 280.53 (SE +/- 1.42, N = 3)
  1a: 280.19 (SE +/- 0.94, N = 3)
  2: 282.35 (SE +/- 0.95, N = 3)
  3: 280.70 (SE +/- 0.24, N = 3)
  (CC) gcc options: -O2 -fvisibility=hidden
CloverLeaf - Lagrangian-Eulerian Hydrodynamics (Seconds, Fewer Is Better)
  1: 554.66 (SE +/- 4.73, N = 3)
  1a: 548.58 (SE +/- 2.70, N = 3)
  2: 544.00 (SE +/- 1.90, N = 3)
  3: 553.07 (SE +/- 1.21, N = 3)
  (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp
Mobile Neural Network 1.1.1 - Model: SqueezeNetV1.0 (ms, Fewer Is Better)
  1a: 23.31 (SE +/- 0.10, N = 3; MIN: 22.79 / MAX: 69.26)
  2: 21.82 (SE +/- 0.30, N = 4; MIN: 20.97 / MAX: 65.5)
  3: 22.15 (SE +/- 0.37, N = 3; MIN: 21.1 / MAX: 72.63)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.1 - Model: resnet-v2-50 (ms, Fewer Is Better)
  1a: 109.59 (SE +/- 1.56, N = 3; MIN: 105.69 / MAX: 155.76)
  2: 111.05 (SE +/- 1.55, N = 4; MIN: 106.53 / MAX: 162)
  3: 111.43 (SE +/- 2.07, N = 3; MIN: 105.81 / MAX: 161.27)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.1 - Model: MobileNetV2_224 (ms, Fewer Is Better)
  1a: 12.76 (SE +/- 0.09, N = 3; MIN: 11.75 / MAX: 51.95)
  2: 12.84 (SE +/- 0.33, N = 4; MIN: 11.48 / MAX: 34.31)
  3: 13.49 (SE +/- 0.32, N = 3; MIN: 11.91 / MAX: 41.51)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.1 - Model: mobilenet-v1-1.0 (ms, Fewer Is Better)
  1a: 14.59 (SE +/- 0.04, N = 3; MIN: 11.71 / MAX: 17.52)
  2: 14.98 (SE +/- 0.21, N = 4; MIN: 11.44 / MAX: 66.12)
  3: 14.82 (SE +/- 0.49, N = 3; MIN: 11.51 / MAX: 50.09)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.1 - Model: inception-v3 (ms, Fewer Is Better)
  1a: 172.91 (SE +/- 1.88, N = 3; MIN: 167.46 / MAX: 269.27)
  2: 172.57 (SE +/- 2.35, N = 4; MIN: 164.23 / MAX: 293.82)
  3: 176.04 (SE +/- 1.11, N = 3; MIN: 172.62 / MAX: 266.53)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
ONNX Runtime 1.6 - Model: yolov4 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  1a: 87 (SE +/- 0.44, N = 3)
  2: 87 (SE +/- 0.00, N = 3)
  3: 86 (SE +/- 0.17, N = 3)
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
ONNX Runtime 1.6 - Model: bertsquad-10 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  1a: 104 (SE +/- 0.90, N = 12)
  2: 106 (SE +/- 0.76, N = 3)
  3: 103 (SE +/- 0.73, N = 3)
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
ONNX Runtime 1.6 - Model: fcn-resnet101-11 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  1a: 15
  2: 15
  3: 15
  SE +/- 0.17, N = 3 (the export lists a single SE value and does not indicate which run it belongs to)
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
ONNX Runtime 1.6 - Model: shufflenet-v2-10 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  1a: 5785 (SE +/- 16.69, N = 3)
  2: 5836 (SE +/- 25.85, N = 3)
  3: 5797 (SE +/- 24.30, N = 3)
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
ONNX Runtime 1.6 - Model: super-resolution-10 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  1a: 761 (SE +/- 7.07, N = 3)
  2: 783 (SE +/- 3.19, N = 3)
  3: 770 (SE +/- 2.03, N = 3)
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
TNN 0.2.3 - Target: CPU - Model: MobileNet v2 (ms, Fewer Is Better)
  1a: 364.40 (SE +/- 1.44, N = 3; MIN: 361.37 / MAX: 372.22)
  2: 361.52 (SE +/- 1.58, N = 3; MIN: 358.11 / MAX: 407.59)
  3: 366.28 (SE +/- 2.96, N = 3; MIN: 360.84 / MAX: 373.56)
  (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
TNN 0.2.3 - Target: CPU - Model: SqueezeNet v1.1 (ms, Fewer Is Better)
  1a: 375.89 (SE +/- 0.98, N = 3; MIN: 373.57 / MAX: 379.55)
  2: 377.54 (SE +/- 0.63, N = 3; MIN: 372.49 / MAX: 379.75)
  3: 380.15 (SE +/- 0.76, N = 3; MIN: 375.68 / MAX: 388.43)
  (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
GROMACS 2021 - Input: water_GMX50_bare (Ns Per Day, More Is Better)
  1a: 0.167 (SE +/- 0.002, N = 4)
  2: 0.175 (SE +/- 0.002, N = 3)
  3: 0.166 (SE +/- 0.000, N = 3)
  (CXX) g++ options: -O3 -pthread
LAMMPS Molecular Dynamics Simulator 29Oct2020 - Model: Rhodopsin Protein (ns/day, More Is Better)
  1: 1.164 (SE +/- 0.005, N = 3)
  1a: 1.162 (SE +/- 0.002, N = 3)
  2: 1.172 (SE +/- 0.006, N = 3)
  3: 1.159 (SE +/- 0.009, N = 3)
  (CXX) g++ options: -O3 -pthread -lm
ASKAP 1.0 - Test: tConvolve MT - Gridding (Million Grid Points Per Second, More Is Better)
  1a: 374.15 (SE +/- 3.33, N = 3)
  2: 367.93 (SE +/- 2.84, N = 14)
  3: 368.79 (SE +/- 1.69, N = 3)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP 1.0 - Test: tConvolve MT - Degridding (Million Grid Points Per Second, More Is Better)
  1a: 434.17 (SE +/- 6.28, N = 3)
  2: 426.57 (SE +/- 3.30, N = 14)
  3: 434.40 (SE +/- 4.84, N = 3)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP 1.0 - Test: tConvolve MPI - Degridding (Mpix/sec, More Is Better)
  1a: 239.59 (SE +/- 2.49, N = 8)
  2: 240.38 (SE +/- 2.13, N = 15)
  3: 239.12 (SE +/- 3.46, N = 4)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP 1.0 - Test: tConvolve MPI - Gridding (Mpix/sec, More Is Better)
  1a: 323.90 (SE +/- 2.70, N = 8)
  2: 325.87 (SE +/- 2.47, N = 15)
  3: 325.10 (SE +/- 6.26, N = 4)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP 1.0 - Test: tConvolve OpenMP - Gridding (Million Grid Points Per Second, More Is Better)
  1a: 338.63 (SE +/- 3.04, N = 11)
  2: 340.23 (SE +/- 2.45, N = 3)
  3: 333.54 (SE +/- 1.95, N = 3)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP 1.0 - Test: tConvolve OpenMP - Degridding (Million Grid Points Per Second, More Is Better)
  1a: 431.55 (SE +/- 4.75, N = 11)
  2: 437.33 (SE +/- 5.32, N = 3)
  3: 436.17 (SE +/- 6.03, N = 3)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP 1.0 - Test: Hogbom Clean OpenMP (Iterations Per Second, More Is Better)
  1a: 59.14 (SE +/- 0.73, N = 3)
  2: 59.70 (SE +/- 0.86, N = 3)
  3: 60.32 (SE +/- 0.20, N = 3)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
LULESH 2.0.3 (z/s, More Is Better)
  1: 441.30 (SE +/- 12.69, N = 12)
  1a: 519.95 (SE +/- 12.59, N = 15)
  2: 445.62 (SE +/- 6.26, N = 15)
  3: 461.70 (SE +/- 5.89, N = 15)
  (CXX) g++ options: -O3 -fopenmp -lm -pthread -lmpi_cxx -lmpi
QMCPACK 3.10 - Input: simple-H2O (Total Execution Time - Seconds, Fewer Is Better)
  1: 38.37 (SE +/- 0.28, N = 3)
  1a: 38.43 (SE +/- 0.05, N = 3)
  2: 38.87 (SE +/- 0.32, N = 3)
  3: 38.76 (SE +/- 0.27, N = 3)
  (CXX) g++ options: -fopenmp -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -march=native -O3 -fomit-frame-pointer -ffast-math -lm -pthread
dav1d 0.8.1 - Video Input: Chimera 1080p (FPS, More Is Better)
  1: 119.69 (SE +/- 1.10, N = 3; MIN: 89.84 / MAX: 200.59)
  1a: 121.64 (SE +/- 0.45, N = 3; MIN: 91.46 / MAX: 204.62)
  2: 120.37 (SE +/- 0.29, N = 3; MIN: 91.11 / MAX: 201.92)
  3: 120.89 (SE +/- 0.80, N = 3; MIN: 90.45 / MAX: 200.97)
  (CC) gcc options: -pthread
dav1d 0.8.1 - Video Input: Summer Nature 4K (FPS, More Is Better)
  1: 30.04 (SE +/- 0.07, N = 3; MIN: 28.15 / MAX: 34.03)
  1a: 30.19 (SE +/- 0.02, N = 3; MIN: 28.35 / MAX: 34.12)
  2: 30.09 (SE +/- 0.08, N = 3; MIN: 28.15 / MAX: 34.13)
  3: 30.17 (SE +/- 0.03, N = 3; MIN: 28.31 / MAX: 34.14)
  (CC) gcc options: -pthread
dav1d 0.8.1 - Video Input: Summer Nature 1080p (FPS, More Is Better)
  1: 110.16 (SE +/- 1.00, N = 3; MIN: 102.8 / MAX: 121.81)
  1a: 111.99 (SE +/- 0.32, N = 3; MIN: 105.14 / MAX: 122.07)
  2: 111.00 (SE +/- 0.92, N = 3; MIN: 103.18 / MAX: 121.5)
  3: 112.88 (SE +/- 0.19, N = 3; MIN: 105.91 / MAX: 122.73)
  (CC) gcc options: -pthread
dav1d 0.8.1 - Video Input: Chimera 1080p 10-bit (FPS, More Is Better)
  1: 29.61 (SE +/- 0.02, N = 3; MIN: 20.51 / MAX: 65.19)
  1a: 29.72 (SE +/- 0.00, N = 3; MIN: 20.56 / MAX: 64.94)
  2: 29.51 (SE +/- 0.18, N = 3; MIN: 20.24 / MAX: 64.75)
  3: 29.67 (SE +/- 0.02, N = 3; MIN: 20.55 / MAX: 64.32)
  (CC) gcc options: -pthread
Timed Eigen Compilation 3.3.9 - Time To Compile (Seconds, Fewer Is Better)
  1: 130.56 (SE +/- 1.55, N = 3)
  1a: 127.82 (SE +/- 0.20, N = 3)
  2: 127.77 (SE +/- 0.43, N = 3)
  3: 127.82 (SE +/- 0.22, N = 3)
Redis 6.0.9 - Test: LPOP (Requests Per Second, More Is Better)
  1a: 1733646.58 (SE +/- 19471.30, N = 15)
  2: 1163471.21 (SE +/- 9854.44, N = 3)
  3: 1170246.42 (SE +/- 2036.74, N = 3)
  (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis 6.0.9 - Test: SADD (Requests Per Second, More Is Better)
  1a: 1480929.42 (SE +/- 1833.47, N = 3)
  2: 1471883.58 (SE +/- 5866.11, N = 3)
  3: 1443346.96 (SE +/- 5061.24, N = 3)
  (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis 6.0.9 - Test: LPUSH (Requests Per Second, More Is Better)
  1a: 1138733.25 (SE +/- 7627.94, N = 3)
  2: 1147588.38 (SE +/- 8154.95, N = 3)
  3: 1132436.75 (SE +/- 5070.14, N = 3)
  (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis 6.0.9 - Test: GET (Requests Per Second, More Is Better)
  1a: 1696677.54 (SE +/- 13707.51, N = 3)
  2: 1553775.24 (SE +/- 20469.12, N = 12)
  3: 1579594.62 (SE +/- 13031.14, N = 3)
  (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis 6.0.9 - Test: SET (Requests Per Second, More Is Better)
  1a: 1319523.38 (SE +/- 16021.57, N = 5)
  2: 1320791.37 (SE +/- 4253.68, N = 3)
  3: 1330886.96 (SE +/- 15879.88, N = 3)
  (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
GnuPG 2.2.27 - 2.7GB Sample File Encryption (Seconds, Fewer Is Better)
  1a: 85.51 (SE +/- 0.13, N = 3)
  2: 84.95 (SE +/- 0.09, N = 3)
  3: 84.80 (SE +/- 0.03, N = 3)
  (CC) gcc options: -O2
Unpacking Firefox 84.0 - Extracting: firefox-84.0.source.tar.xz (Seconds, Fewer Is Better)
  1a: 27.03 (SE +/- 0.08, N = 4)
  2: 27.70 (SE +/- 0.31, N = 4)
  3: 27.52 (SE +/- 0.12, N = 4)
QuantLib 1.21 (MFLOPS, More Is Better)
  1: 1579.6 (SE +/- 8.09, N = 3)
  1a: 1573.9 (SE +/- 6.72, N = 3)
  2: 1579.7 (SE +/- 7.79, N = 3)
  3: 1578.7 (SE +/- 7.39, N = 3)
  (CXX) g++ options: -O3 -march=native -rdynamic -lboost_timer -lboost_system -lboost_chrono
Phoronix Test Suite v10.8.4