Haswell 2021
Intel Xeon E5-2687W v3 testing with a MSI X99S SLI PLUS (MS-7885) v1.0 (1.E0 BIOS) and NVIDIA GeForce GTX 770 on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2101287-HA-HASWELL2015&grs&sor .
Haswell 2021 - System Details (runs 1, 2, 3)
  Processor: Intel Xeon E5-2687W v3 @ 3.50GHz (10 Cores / 20 Threads)
  Motherboard: MSI X99S SLI PLUS (MS-7885) v1.0 (1.E0 BIOS)
  Chipset: Intel Xeon E7 v3/Xeon
  Memory: 32GB
  Disk: 80GB INTEL SSDSCKGW08
  Graphics: NVIDIA GeForce GTX 770
  Audio: Realtek ALC892
  Monitor: LG Ultra HD
  Network: Intel I218-V
  OS: Ubuntu 20.04
  Kernel: 5.9.0-050900rc7daily20200928-generic (x86_64) 20200927
  Desktop: GNOME Shell 3.36.4
  Display Server: X Server 1.20.8
  Display Driver: modesetting 1.20.8
  Compiler: GCC 9.3.0
  File-System: ext4
  Screen Resolution: 3840x2160
Kernel Details
  Transparent Huge Pages: madvise
Compiler Details
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details
  Scaling Governor: intel_cpufreq ondemand
  CPU Microcode: 0x44
Python Details
  Python 3.8.5
Security Details
  itlb_multihit: KVM: Mitigation of VMX disabled
  l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable
  mds: Mitigation of Clear buffers; SMT vulnerable
  meltdown: Mitigation of PTI
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling
  srbds: Not affected
  tsx_async_abort: Not affected
Haswell 2021 redis: GET npb: EP.C ncnn: CPU - yolov4-tiny cpuminer-opt: Skeincoin cloverleaf: Lagrangian-Eulerian Hydrodynamics financebench: Repo OpenMP onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU cpuminer-opt: Blake-2 S ncnn: CPU - resnet18 financebench: Bonds OpenMP cp2k: Fayalite-FIST Data ncnn: CPU - googlenet redis: SET askap: tConvolve OpenMP - Degridding cpuminer-opt: Myriad-Groestl lzbench: Zstd 8 - Decompression npb: EP.D encode-opus: WAV To Opus Encode mnn: MobileNetV2_224 webp2: Default kripke: qmcpack: simple-H2O ncnn: CPU - blazeface ncnn: CPU-v2-v2 - mobilenet-v2 ncnn: CPU - regnety_400m mnn: mobilenet-v1-1.0 dav1d: Summer Nature 1080p gnupg: 2.7GB Sample File Encryption build-eigen: Time To Compile dav1d: Summer Nature 4K cpuminer-opt: Magi redis: SADD mnn: resnet-v2-50 perf-bench: Sched Pipe unpack-firefox: firefox-84.0.source.tar.xz ncnn: CPU - mobilenet ncnn: CPU - resnet50 cpuminer-opt: Garlicoin cryptsetup: AES-XTS 256b Decryption amg: onnx: bertsquad-10 - OpenMP CPU ncnn: CPU - mnasnet cryptsetup: AES-XTS 256b Encryption lulesh: cpuminer-opt: Ringcoin lzbench: Brotli 2 - Decompression build2: Time To Compile askap: tConvolve MT - Degridding redis: LPUSH ncnn: CPU - efficientnet-b0 lzbench: Brotli 0 - Decompression rav1e: 10 onednn: Recurrent Neural Network Inference - u8s8f32 - CPU tnn: CPU - SqueezeNet v1.1 cpuminer-opt: LBC, LBRY Credits ncnn: CPU - shufflenet-v2 onednn: Recurrent Neural Network Inference - f32 - CPU cpuminer-opt: x25x onednn: IP Shapes 3D - f32 - CPU cryptsetup: Twofish-XTS 256b Encryption cryptsetup: PBKDF2-whirlpool tnn: CPU - MobileNet v2 onnx: shufflenet-v2-10 - OpenMP CPU rav1e: 6 onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU cryptsetup: Serpent-XTS 256b Encryption onednn: IP Shapes 3D - u8s8f32 - CPU encode-wavpack: WAV To WavPack gcrypt: dav1d: Chimera 1080p 10-bit etcpak: ETC1 cryptsetup: Serpent-XTS 256b Decryption etcpak: ETC2 ncnn: CPU-v3-v3 - mobilenet-v3 ncnn: CPU - vgg16 rav1e: 1 
etcpak: DXT1 cryptsetup: AES-XTS 512b Decryption ncnn: CPU - alexnet askap: tConvolve MT - Gridding cryptsetup: Twofish-XTS 512b Decryption cryptsetup: Twofish-XTS 256b Decryption onnx: yolov4 - OpenMP CPU lzbench: Zstd 1 - Decompression cryptsetup: AES-XTS 512b Encryption onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU lzbench: Brotli 0 - Compression onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU onednn: IP Shapes 1D - u8s8f32 - CPU dav1d: Chimera 1080p mnn: SqueezeNetV1.0 lzbench: Zstd 1 - Compression rav1e: 5 webp2: Quality 95, Compression Effort 7 openfoam: Motorbike 30M lzbench: Crush 0 - Decompression synthmark: VoiceMark_100 onednn: Deconvolution Batch shapes_3d - f32 - CPU onednn: Deconvolution Batch shapes_1d - f32 - CPU webp2: Quality 75, Compression Effort 7 cryptsetup: Serpent-XTS 512b Decryption onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU cryptsetup: Serpent-XTS 512b Encryption mnn: inception-v3 etcpak: ETC1 + Dithering askap: Hogbom Clean OpenMP build-godot: Time To Compile ncnn: CPU - squeezenet_ssd webp2: Quality 100, Lossless Compression onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU onednn: Convolution Batch Shapes Auto - f32 - CPU onednn: IP Shapes 1D - f32 - CPU onednn: Recurrent Neural Network Training - u8s8f32 - CPU quantlib: lzbench: Libdeflate 1 - Decompression cryptsetup: PBKDF2-sha512 onnx: super-resolution-10 - OpenMP CPU onednn: Recurrent Neural Network Training - f32 - CPU webp2: Quality 100, Compression Effort 5 onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU cryptsetup: Twofish-XTS 512b Encryption onnx: fcn-resnet101-11 - OpenMP CPU lzbench: Libdeflate 1 - Compression lzbench: Brotli 2 - Compression lzbench: Crush 0 - Compression lzbench: Zstd 8 - Compression lzbench: XZ 0 - Decompression lzbench: XZ 0 - Compression redis: LPOP askap: tConvolve OpenMP - Gridding askap: tConvolve MPI - Gridding askap: tConvolve MPI - Degridding cpuminer-opt: Triple SHA-256, Onecoin 
cpuminer-opt: Quad SHA-256, Pyrite cpuminer-opt: Deepcoin lammps: Rhodopsin Protein clomp: Static OMP Speedup 1 2 3 1887226.87 1050.17 31.43 34847 136.89 68017.653646 6.66011 220508 15.20 115661.129687 1543.683 16.60 1415197.08 2664.35 10363 1423 1027.88 10.686 5.061 5.677 46104420 51.153 2.76 6.22 24.08 6.196 382.21 81.081 109.859 142.04 203.98 1602736.88 54.511 73755 24.038 20.03 29.06 1736.89 1697.2 302891133 545 5.52 1702.3 4654.4664 1579.48 581 159.687 1756.70 1247740.00 8.28 503 2.375 2208.25 329.721 23330 7.05 2209.39 219.58 7.41691 344.4 530142 339.212 9736 1.065 3.00389 549.0 2.48994 17.395 273.551 68.44 235.540 532.9 140.731 5.41 51.78 0.277 1083.861 1389.6 12.08 1285.02 346.4 346.8 328 1390 1400.2 4051.99 356 3.36850 2.83074 459.11 8.447 401 0.810 545.956 219.05 431 551.181 8.39325 5.76536 298.918 533.7 12.7075 550.7 55.865 226.752 239.235 186.697 26.55 939.760 5.07564 13.7854 4.07476 4045.79 1689.6 995 1328993 4007 4048.95 14.482 2211.56 345.6 59 182 148 76 65 96 34 2010121.55 1778.10 2071.46 1699.85 61247 47653 7487.98 4.415 13.2 1775802.50 1002.01 31.96 34338 140.64 69828.361979 6.51937 221938 15.54 114077.565104 1576.952 16.84 1408307.62 2716.9 10567 1413 1008.91 10.617 5.146 5.664 46529853 51.903 2.79 6.22 23.93 6.112 380.86 80.221 111.128 140.67 204.18 1588662.75 53.966 73695 23.891 20.21 28.86 1750.88 1702.6 303612367 541 5.55 1708.5 4644.4601 1582.49 585 160.674 1753.61 1242294.79 8.30 506 2.361 2211.70 328.625 23407 7.07 2214.44 220.73 7.39027 345.4 532814 337.511 9775 1.070 2.99080 550.3 2.47930 17.322 272.411 68.62 235.968 531.5 140.315 5.41 51.97 0.277 1080.546 1393.0 12.12 1284.71 346.8 345.8 328 1389 1401.7 4040.59 356 3.35947 2.83861 459.32 8.439 401 0.810 545.663 219.10 432 550.142 8.40124 5.77823 298.253 533.3 12.6938 551.2 55.917 227.136 239.425 186.844 26.57 938.916 5.07714 13.7985 4.07451 4044.15 1691.5 994 1330118 4008 4050.54 14.478 2211.32 345.7 59 182 148 76 65 96 34 1271416.38 1881.94 1984.73 1589.75 60350 48622 7013.87 4.261 
11.0 1765664.37 1047.67 32.73 35707 140.57 69374.869141 6.68411 226073 15.19 116695.378255 1546.678 16.95 1436898.96 2716.9 10510 1396 1019.37 10.796 5.100 5.756 46840427 51.913 2.75 6.31 23.75 6.151 385.99 80.127 111.065 140.47 206.24 1605411.58 54.124 74430 24.127 20.06 29.11 1738.45 1710.6 305133633 542 5.56 1714.4 4621.6305 1590.60 582 160.093 1745.94 1249824.00 8.33 505 2.375 2221.13 327.826 23463 7.03 2221.93 220.82 7.43079 346.2 532634 338.741 9729 1.068 2.99407 551.4 2.48125 17.326 273.013 68.34 236.454 533.5 140.210 5.43 51.85 0.276 1084.329 1394.3 12.11 1280.90 345.7 346.9 327 1386 1404.2 4048.11 357 3.36883 2.83373 458.06 8.424 400 0.808 544.662 219.57 431 549.935 8.38226 5.76890 298.540 532.6 12.7193 551.7 55.960 227.056 239.617 186.557 26.59 940.231 5.07005 13.7804 4.07980 4040.61 1691.6 995 1328993 4005 4051.42 14.475 2210.61 345.6 59 182 148 76 65 96 34 1758555.23 1962.74 1979.00 1597.84 62923 48341 7312.24 4.463 13.0 OpenBenchmarking.org
Redis 6.0.9 - Test: GET (Requests Per Second, More Is Better)
  Run 1: 1887226.87 (SE +/- 6745.04, N = 3)
  Run 2: 1775802.50 (SE +/- 25400.58, N = 3)
  Run 3: 1765664.37 (SE +/- 20496.02, N = 3)
  1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
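The Redis GET figures above show the largest gap between runs in this result file, so they make a convenient worked example of how to read the "SE +/- x, N = n" annotations. The following is a minimal sketch, not part of the Phoronix Test Suite output: the helper names (`percent_diff`, `combined_se`, `compare`) are hypothetical, and it assumes the three runs are independent so that their standard errors combine in quadrature.

```python
import math

# Values copied from the Redis GET result above: (mean requests/sec, SE, N).
runs = {
    "1": (1887226.87, 6745.04, 3),
    "2": (1775802.50, 25400.58, 3),
    "3": (1765664.37, 20496.02, 3),
}


def percent_diff(a, b):
    """Difference of a relative to b, in percent."""
    return (a - b) / b * 100.0


def combined_se(se_a, se_b):
    """Standard error of the difference of two means, assuming independence."""
    return math.sqrt(se_a ** 2 + se_b ** 2)


def compare(name_a, name_b):
    """Return (percent difference, difference in units of combined SE)."""
    mean_a, se_a, _ = runs[name_a]
    mean_b, se_b, _ = runs[name_b]
    diff = mean_a - mean_b
    return percent_diff(mean_a, mean_b), diff / combined_se(se_a, se_b)


if __name__ == "__main__":
    for other in ("2", "3"):
        pct, ratio = compare("1", other)
        print(f"run 1 vs run {other}: {pct:+.2f}% ({ratio:.1f} combined SE)")
```

With only N = 3 samples per run, a ratio of a few combined standard errors is suggestive rather than conclusive, so the same comparison is best repeated on the other tests before drawing conclusions about a run.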
NAS Parallel Benchmarks 3.4 - Test / Class: EP.C (Total Mop/s, More Is Better)
  Run 1: 1050.17 (SE +/- 16.48, N = 3)
  Run 3: 1047.67 (SE +/- 4.68, N = 3)
  Run 2: 1002.01 (SE +/- 11.00, N = 15)
  1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
  2. Open MPI 4.0.3
NCNN 20201218 - Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better)
  Run 1: 31.43 (SE +/- 0.35, N = 3; MIN: 30.14 / MAX: 34.58)
  Run 2: 31.96 (SE +/- 0.60, N = 3; MIN: 30.06 / MAX: 35.42)
  Run 3: 32.73 (SE +/- 0.34, N = 3; MIN: 31.73 / MAX: 35.09)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Cpuminer-Opt 3.15.5 - Algorithm: Skeincoin (kH/s, More Is Better)
  Run 3: 35707 (SE +/- 140.75, N = 3)
  Run 1: 34847 (SE +/- 291.68, N = 15)
  Run 2: 34338 (SE +/- 325.79, N = 15)
  1. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
CloverLeaf - Lagrangian-Eulerian Hydrodynamics (Seconds, Fewer Is Better)
  Run 1: 136.89 (SE +/- 0.22, N = 3)
  Run 3: 140.57 (SE +/- 0.33, N = 3)
  Run 2: 140.64 (SE +/- 0.54, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp
FinanceBench 2016-07-25 - Benchmark: Repo OpenMP (ms, Fewer Is Better)
  Run 1: 68017.65 (SE +/- 89.04, N = 3)
  Run 3: 69374.87 (SE +/- 922.31, N = 4)
  Run 2: 69828.36 (SE +/- 1002.67, N = 3)
  1. (CXX) g++ options: -O3 -march=native -fopenmp
oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 2: 6.51937 (SE +/- 0.07423, N = 6; MIN: 6.3)
  Run 1: 6.66011 (SE +/- 0.08426, N = 4; MIN: 6.33)
  Run 3: 6.68411 (SE +/- 0.09722, N = 3; MIN: 6.42)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Cpuminer-Opt 3.15.5 - Algorithm: Blake-2 S (kH/s, More Is Better)
  Run 3: 226073 (SE +/- 3030.09, N = 4)
  Run 2: 221938 (SE +/- 2626.85, N = 15)
  Run 1: 220508 (SE +/- 3265.93, N = 4)
  1. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
NCNN 20201218 - Target: CPU - Model: resnet18 (ms, Fewer Is Better)
  Run 3: 15.19 (SE +/- 0.06, N = 3; MIN: 14.96 / MAX: 16.48)
  Run 1: 15.20 (SE +/- 0.04, N = 3; MIN: 15.01 / MAX: 15.36)
  Run 2: 15.54 (SE +/- 0.15, N = 3; MIN: 15 / MAX: 110.46)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
FinanceBench 2016-07-25 - Benchmark: Bonds OpenMP (ms, Fewer Is Better)
  Run 2: 114077.57 (SE +/- 20.05, N = 3)
  Run 1: 115661.13 (SE +/- 1477.66, N = 5)
  Run 3: 116695.38 (SE +/- 1452.56, N = 12)
  1. (CXX) g++ options: -O3 -march=native -fopenmp
CP2K Molecular Dynamics 8.1 - Fayalite-FIST Data (Seconds, Fewer Is Better)
  Run 1: 1543.68
  Run 3: 1546.68
  Run 2: 1576.95
NCNN 20201218 - Target: CPU - Model: googlenet (ms, Fewer Is Better)
  Run 1: 16.60 (SE +/- 0.14, N = 3; MIN: 16.2 / MAX: 74.59)
  Run 2: 16.84 (SE +/- 0.20, N = 3; MIN: 16.12 / MAX: 18.21)
  Run 3: 16.95 (SE +/- 0.31, N = 3; MIN: 16.16 / MAX: 17.97)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Redis 6.0.9 - Test: SET (Requests Per Second, More Is Better)
  Run 3: 1436898.96 (SE +/- 9026.88, N = 3)
  Run 1: 1415197.08 (SE +/- 14514.92, N = 8)
  Run 2: 1408307.62 (SE +/- 19369.44, N = 15)
  1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
ASKAP 1.0 - Test: tConvolve OpenMP - Degridding (Million Grid Points Per Second, More Is Better)
  Run 3: 2716.90 (SE +/- 0.00, N = 3)
  Run 2: 2716.90 (SE +/- 0.00, N = 15)
  Run 1: 2664.35 (SE +/- 1.79, N = 15)
  1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
Cpuminer-Opt 3.15.5 - Algorithm: Myriad-Groestl (kH/s, More Is Better)
  Run 2: 10567 (SE +/- 161.69, N = 3)
  Run 3: 10510 (SE +/- 125.83, N = 3)
  Run 1: 10363 (SE +/- 6.67, N = 3)
  1. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
lzbench 1.8 - Test: Zstd 8 - Process: Decompression (MB/s, More Is Better)
  Run 1: 1423 (SE +/- 4.06, N = 3)
  Run 2: 1413 (SE +/- 7.88, N = 3)
  Run 3: 1396 (SE +/- 21.06, N = 3)
  1. (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
NAS Parallel Benchmarks 3.4 - Test / Class: EP.D (Total Mop/s, More Is Better)
  Run 1: 1027.88 (SE +/- 14.12, N = 4)
  Run 3: 1019.37 (SE +/- 17.23, N = 3)
  Run 2: 1008.91 (SE +/- 9.01, N = 12)
  1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
  2. Open MPI 4.0.3
Opus Codec Encoding 1.3.1 - WAV To Opus Encode (Seconds, Fewer Is Better)
  Run 2: 10.62 (SE +/- 0.04, N = 5)
  Run 1: 10.69 (SE +/- 0.03, N = 5)
  Run 3: 10.80 (SE +/- 0.03, N = 5)
  1. (CXX) g++ options: -fvisibility=hidden -logg -lm
Mobile Neural Network 1.1.1 - Model: MobileNetV2_224 (ms, Fewer Is Better)
  Run 1: 5.061 (SE +/- 0.016, N = 3; MIN: 4.98 / MAX: 5.99)
  Run 3: 5.100 (SE +/- 0.037, N = 3; MIN: 4.42 / MAX: 11.39)
  Run 2: 5.146 (SE +/- 0.008, N = 3; MIN: 5.03 / MAX: 6.03)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
WebP2 Image Encode 20210126 - Encode Settings: Default (Seconds, Fewer Is Better)
  Run 2: 5.664 (SE +/- 0.018, N = 3)
  Run 1: 5.677 (SE +/- 0.037, N = 3)
  Run 3: 5.756 (SE +/- 0.043, N = 3)
  1. (CXX) g++ options: -msse4.2 -fno-rtti -O3 -rdynamic -lpthread -ljpeg
Kripke 1.2.4 (Throughput FoM, More Is Better)
  Run 3: 46840427 (SE +/- 186245.67, N = 3)
  Run 2: 46529853 (SE +/- 199447.85, N = 3)
  Run 1: 46104420 (SE +/- 200225.59, N = 3)
  1. (CXX) g++ options: -O3 -fopenmp
QMCPACK 3.10 - Input: simple-H2O (Total Execution Time - Seconds, Fewer Is Better)
  Run 1: 51.15 (SE +/- 0.64, N = 5)
  Run 2: 51.90 (SE +/- 0.70, N = 5)
  Run 3: 51.91 (SE +/- 0.74, N = 3)
  1. (CXX) g++ options: -fopenmp -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -march=native -O3 -fomit-frame-pointer -ffast-math -pthread -lm
NCNN 20201218 - Target: CPU - Model: blazeface (ms, Fewer Is Better)
  Run 3: 2.75 (SE +/- 0.02, N = 3; MIN: 2.69 / MAX: 3.11)
  Run 1: 2.76 (SE +/- 0.01, N = 3; MIN: 2.7 / MAX: 3.13)
  Run 2: 2.79 (SE +/- 0.04, N = 3; MIN: 2.69 / MAX: 3.4)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  Run 1: 6.22 (SE +/- 0.01, N = 3; MIN: 6.14 / MAX: 6.38)
  Run 2: 6.22 (SE +/- 0.02, N = 3; MIN: 6.12 / MAX: 6.83)
  Run 3: 6.31 (SE +/- 0.10, N = 3; MIN: 6.11 / MAX: 64.27)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: regnety_400m (ms, Fewer Is Better)
  Run 3: 23.75 (SE +/- 0.06, N = 3; MIN: 23.54 / MAX: 24.43)
  Run 2: 23.93 (SE +/- 0.13, N = 3; MIN: 23.59 / MAX: 24.31)
  Run 1: 24.08 (SE +/- 0.03, N = 3; MIN: 23.89 / MAX: 25.07)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Mobile Neural Network 1.1.1 - Model: mobilenet-v1-1.0 (ms, Fewer Is Better)
  Run 2: 6.112 (SE +/- 0.007, N = 3; MIN: 6.03 / MAX: 12.61)
  Run 3: 6.151 (SE +/- 0.018, N = 3; MIN: 6.05 / MAX: 6.9)
  Run 1: 6.196 (SE +/- 0.006, N = 3; MIN: 4.66 / MAX: 19.35)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
dav1d 0.8.1 - Video Input: Summer Nature 1080p (FPS, More Is Better)
  Run 3: 385.99 (SE +/- 1.19, N = 3; MIN: 312.93 / MAX: 423.3)
  Run 1: 382.21 (SE +/- 3.01, N = 3; MIN: 305.4 / MAX: 420.06)
  Run 2: 380.86 (SE +/- 3.16, N = 3; MIN: 302.45 / MAX: 418.3)
  1. (CC) gcc options: -pthread
GnuPG 2.2.27 - 2.7GB Sample File Encryption (Seconds, Fewer Is Better)
  Run 3: 80.13 (SE +/- 0.07, N = 3)
  Run 2: 80.22 (SE +/- 0.16, N = 3)
  Run 1: 81.08 (SE +/- 1.03, N = 3)
  1. (CC) gcc options: -O2
Timed Eigen Compilation 3.3.9 - Time To Compile (Seconds, Fewer Is Better)
  Run 1: 109.86 (SE +/- 0.16, N = 3)
  Run 3: 111.07 (SE +/- 0.15, N = 3)
  Run 2: 111.13 (SE +/- 0.09, N = 3)
dav1d 0.8.1 - Video Input: Summer Nature 4K (FPS, More Is Better)
  Run 1: 142.04 (SE +/- 0.27, N = 3; MIN: 130.3 / MAX: 163.29)
  Run 2: 140.67 (SE +/- 0.53, N = 3; MIN: 125.57 / MAX: 161.64)
  Run 3: 140.47 (SE +/- 0.14, N = 3; MIN: 130.58 / MAX: 160.85)
  1. (CC) gcc options: -pthread
Cpuminer-Opt 3.15.5 - Algorithm: Magi (kH/s, More Is Better)
  Run 3: 206.24 (SE +/- 2.29, N = 14)
  Run 2: 204.18 (SE +/- 0.17, N = 3)
  Run 1: 203.98 (SE +/- 0.12, N = 3)
  1. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Redis 6.0.9 - Test: SADD (Requests Per Second, More Is Better)
  Run 3: 1605411.58 (SE +/- 7906.68, N = 3)
  Run 1: 1602736.88 (SE +/- 21115.79, N = 4)
  Run 2: 1588662.75 (SE +/- 10942.24, N = 3)
  1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Mobile Neural Network 1.1.1 - Model: resnet-v2-50 (ms, Fewer Is Better)
  Run 2: 53.97 (SE +/- 0.06, N = 3; MIN: 53.45 / MAX: 128.98)
  Run 3: 54.12 (SE +/- 0.10, N = 3; MIN: 53.84 / MAX: 125.64)
  Run 1: 54.51 (SE +/- 0.07, N = 3; MIN: 53.53 / MAX: 130.53)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
perf-bench - Benchmark: Sched Pipe (ops/sec, More Is Better)
  Run 3: 74430 (SE +/- 629.67, N = 3)
  Run 1: 73755 (SE +/- 327.32, N = 3)
  Run 2: 73695 (SE +/- 929.56, N = 5)
  1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lz -llzma -lnuma
Unpacking Firefox 84.0 - Extracting: firefox-84.0.source.tar.xz (Seconds, Fewer Is Better)
  Run 2: 23.89 (SE +/- 0.07, N = 4)
  Run 1: 24.04 (SE +/- 0.17, N = 4)
  Run 3: 24.13 (SE +/- 0.10, N = 4)
NCNN 20201218 - Target: CPU - Model: mobilenet (ms, Fewer Is Better)
  Run 1: 20.03 (SE +/- 0.02, N = 3; MIN: 19.87 / MAX: 21.34)
  Run 3: 20.06 (SE +/- 0.01, N = 3; MIN: 19.95 / MAX: 21.79)
  Run 2: 20.21 (SE +/- 0.14, N = 3; MIN: 19.81 / MAX: 21.32)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: resnet50 (ms, Fewer Is Better)
  Run 2: 28.86 (SE +/- 0.10, N = 3; MIN: 28.14 / MAX: 29.56)
  Run 1: 29.06 (SE +/- 0.25, N = 3; MIN: 28.13 / MAX: 134.5)
  Run 3: 29.11 (SE +/- 0.09, N = 3; MIN: 28.24 / MAX: 30.15)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Cpuminer-Opt 3.15.5 - Algorithm: Garlicoin (kH/s, More Is Better)
  Run 2: 1750.88 (SE +/- 6.73, N = 3)
  Run 3: 1738.45 (SE +/- 4.72, N = 3)
  Run 1: 1736.89 (SE +/- 7.48, N = 3)
  1. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cryptsetup - AES-XTS 256b Decryption (MiB/s, More Is Better)
  Run 3: 1710.6 (SE +/- 1.95, N = 3)
  Run 2: 1702.6 (SE +/- 1.84, N = 3)
  Run 1: 1697.2 (SE +/- 8.30, N = 3)
Algebraic Multi-Grid Benchmark 1.2 (Figure Of Merit, More Is Better)
  Run 3: 305133633 (SE +/- 980611.02, N = 3)
  Run 2: 303612367 (SE +/- 2849594.82, N = 3)
  Run 1: 302891133 (SE +/- 2829162.69, N = 3)
  1. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -pthread -lmpi
ONNX Runtime 1.6 - Model: bertsquad-10 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  Run 1: 545 (SE +/- 0.87, N = 3)
  Run 3: 542 (SE +/- 0.83, N = 3)
  Run 2: 541 (SE +/- 1.36, N = 3)
  1. (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
NCNN 20201218 - Target: CPU - Model: mnasnet (ms, Fewer Is Better)
  Run 1: 5.52 (SE +/- 0.01, N = 3; MIN: 5.41 / MAX: 5.78)
  Run 2: 5.55 (SE +/- 0.01, N = 3; MIN: 5.4 / MAX: 5.65)
  Run 3: 5.56 (SE +/- 0.07, N = 3; MIN: 5.41 / MAX: 5.79)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Cryptsetup - AES-XTS 256b Encryption (MiB/s, More Is Better)
  Run 3: 1714.4 (SE +/- 2.43, N = 3)
  Run 2: 1708.5 (SE +/- 3.81, N = 3)
  Run 1: 1702.3 (SE +/- 6.83, N = 3)
LULESH 2.0.3 (z/s, More Is Better)
  Run 1: 4654.47 (SE +/- 37.60, N = 3)
  Run 2: 4644.46 (SE +/- 60.83, N = 3)
  Run 3: 4621.63 (SE +/- 60.75, N = 3)
  1. (CXX) g++ options: -O3 -fopenmp -lm -pthread -lmpi_cxx -lmpi
Cpuminer-Opt 3.15.5 - Algorithm: Ringcoin (kH/s, More Is Better)
  Run 3: 1590.60 (SE +/- 11.74, N = 14)
  Run 2: 1582.49 (SE +/- 10.01, N = 3)
  Run 1: 1579.48 (SE +/- 3.43, N = 3)
  1. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
lzbench 1.8 - Test: Brotli 2 - Process: Decompression (MB/s, More Is Better)
  Run 2: 585 (SE +/- 0.33, N = 3)
  Run 3: 582 (SE +/- 2.73, N = 3)
  Run 1: 581 (SE +/- 5.17, N = 3)
  1. (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
Build2 0.13 - Time To Compile (Seconds, Fewer Is Better)
  Run 1: 159.69 (SE +/- 0.55, N = 3)
  Run 3: 160.09 (SE +/- 0.91, N = 3)
  Run 2: 160.67 (SE +/- 0.85, N = 3)
ASKAP 1.0 - Test: tConvolve MT - Degridding (Million Grid Points Per Second, More Is Better)
  Run 1: 1756.70 (SE +/- 0.84, N = 3)
  Run 2: 1753.61 (SE +/- 0.77, N = 3)
  Run 3: 1745.94 (SE +/- 0.99, N = 3)
  1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
Redis 6.0.9 - Test: LPUSH (Requests Per Second, More Is Better)
  Run 3: 1249824.00 (SE +/- 2768.29, N = 3)
  Run 1: 1247740.00 (SE +/- 3115.55, N = 3)
  Run 2: 1242294.79 (SE +/- 9099.06, N = 3)
  1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
NCNN 20201218 - Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  Run 1: 8.28 (SE +/- 0.02, N = 3; MIN: 8.18 / MAX: 8.87)
  Run 2: 8.30 (SE +/- 0.02, N = 3; MIN: 8.14 / MAX: 8.79)
  Run 3: 8.33 (SE +/- 0.01, N = 3; MIN: 8.16 / MAX: 9.17)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
lzbench 1.8 - Test: Brotli 0 - Process: Decompression (MB/s, More Is Better)
  Run 2: 506
  Run 3: 505
  Run 1: 503
  SE as exported (two values for three runs): SE +/- 0.67, N = 3; SE +/- 3.00, N = 3
  1. (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
rav1e 0.4 - Speed: 10 (Frames Per Second, More Is Better)
  Run 3: 2.375 (SE +/- 0.013, N = 3)
  Run 1: 2.375 (SE +/- 0.011, N = 3)
  Run 2: 2.361 (SE +/- 0.026, N = 3)
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 2208.25 (SE +/- 0.62, N = 3; MIN: 2204.85)
  Run 2: 2211.70 (SE +/- 1.77, N = 3; MIN: 2206.98)
  Run 3: 2221.13 (SE +/- 9.79, N = 3; MIN: 2207.17)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
TNN 0.2.3 - Target: CPU - Model: SqueezeNet v1.1 (ms, Fewer Is Better)
  Run 3: 327.83 (SE +/- 0.61, N = 3; MIN: 326.73 / MAX: 330.31)
  Run 2: 328.63 (SE +/- 0.16, N = 3; MIN: 328.14 / MAX: 330.14)
  Run 1: 329.72 (SE +/- 0.32, N = 3; MIN: 329.09 / MAX: 331.08)
  1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
Cpuminer-Opt 3.15.5 - Algorithm: LBC, LBRY Credits (kH/s, More Is Better)
  Run 3: 23463 (SE +/- 96.84, N = 3)
  Run 2: 23407 (SE +/- 86.86, N = 3)
  Run 1: 23330 (SE +/- 113.58, N = 3)
  1. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
NCNN 20201218 - Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  Run 3: 7.03 (SE +/- 0.05, N = 3; MIN: 6.94 / MAX: 7.65)
  Run 1: 7.05 (SE +/- 0.01, N = 3; MIN: 6.99 / MAX: 7.63)
  Run 2: 7.07 (SE +/- 0.05, N = 3; MIN: 6.95 / MAX: 7.92)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 2209.39 (SE +/- 1.96, N = 3; MIN: 2203.59)
  Run 2: 2214.44 (SE +/- 4.53, N = 3; MIN: 2206.84)
  Run 3: 2221.93 (SE +/- 8.06, N = 3; MIN: 2205.08)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Cpuminer-Opt 3.15.5 - Algorithm: x25x (kH/s, More Is Better)
  Run 3: 220.82 (SE +/- 0.07, N = 3)
  Run 2: 220.73 (SE +/- 0.07, N = 3)
  Run 1: 219.58 (SE +/- 1.24, N = 3)
  1. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 2: 7.39027 (SE +/- 0.00084, N = 3; MIN: 7.35)
  Run 1: 7.41691 (SE +/- 0.02331, N = 3; MIN: 7.36)
  Run 3: 7.43079 (SE +/- 0.03820, N = 3; MIN: 7.35)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Cryptsetup - Twofish-XTS 256b Encryption (MiB/s, More Is Better)
  Run 3: 346.2 (SE +/- 0.07, N = 3)
  Run 2: 345.4 (SE +/- 0.34, N = 3)
  Run 1: 344.4 (SE +/- 1.43, N = 3)
Cryptsetup - PBKDF2-whirlpool (Iterations Per Second, More Is Better)
  Run 2: 532814 (SE +/- 541.00, N = 3)
  Run 3: 532634 (SE +/- 720.67, N = 3)
  Run 1: 530142 (SE +/- 2447.10, N = 3)
TNN 0.2.3 - Target: CPU - Model: MobileNet v2 (ms, Fewer Is Better)
  Run 2: 337.51 (SE +/- 0.24, N = 3; MIN: 336.44 / MAX: 346.96)
  Run 3: 338.74 (SE +/- 0.90, N = 3; MIN: 336.55 / MAX: 341.55)
  Run 1: 339.21 (SE +/- 1.18, N = 3; MIN: 336.01 / MAX: 348.17)
  1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
ONNX Runtime 1.6 - Model: shufflenet-v2-10 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  Run 2: 9775 (SE +/- 16.03, N = 3)
  Run 1: 9736 (SE +/- 9.85, N = 3)
  Run 3: 9729 (SE +/- 28.57, N = 3)
  1. (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
rav1e 0.4 - Speed: 6 (Frames Per Second, More Is Better)
  Run 2: 1.070 (SE +/- 0.003, N = 3)
  Run 3: 1.068 (SE +/- 0.002, N = 3)
  Run 1: 1.065 (SE +/- 0.008, N = 3)
oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 2: 2.99080 (SE +/- 0.01863, N = 3; MIN: 2.83)
  Run 3: 2.99407 (SE +/- 0.00263, N = 3; MIN: 2.94)
  Run 1: 3.00389 (SE +/- 0.00849, N = 3; MIN: 2.94)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Cryptsetup - Serpent-XTS 256b Encryption (MiB/s, More Is Better)
  Run 3: 551.4 (SE +/- 0.07, N = 3)
  Run 2: 550.3 (SE +/- 1.50, N = 3)
  Run 1: 549.0 (SE +/- 2.47, N = 3)
oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 2: 2.47930 (SE +/- 0.00118, N = 3; MIN: 2.45)
  Run 3: 2.48125 (SE +/- 0.00042, N = 3; MIN: 2.45)
  Run 1: 2.48994 (SE +/- 0.00663, N = 3; MIN: 2.45)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
WavPack Audio Encoding 5.3 - WAV To WavPack (Seconds, Fewer Is Better)
  Run 2: 17.32 (SE +/- 0.02, N = 5)
  Run 3: 17.33 (SE +/- 0.03, N = 5)
  Run 1: 17.40 (SE +/- 0.09, N = 5)
  1. (CXX) g++ options: -rdynamic
Gcrypt Library 1.9 (Seconds, Fewer Is Better)
  Run 2: 272.41 (SE +/- 0.33, N = 3)
  Run 3: 273.01 (SE +/- 0.41, N = 3)
  Run 1: 273.55 (SE +/- 0.82, N = 3)
  1. (CC) gcc options: -O2 -fvisibility=hidden
dav1d 0.8.1 - Video Input: Chimera 1080p 10-bit (FPS, More Is Better)
  Run 2: 68.62 (SE +/- 0.17, N = 3; MIN: 44.12 / MAX: 171.68)
  Run 1: 68.44 (SE +/- 0.18, N = 3; MIN: 43.96 / MAX: 169.86)
  Run 3: 68.34 (SE +/- 0.08, N = 3; MIN: 44.14 / MAX: 168.93)
  1. (CC) gcc options: -pthread
Etcpak 0.7 - Configuration: ETC1 (Mpx/s, More Is Better)
  Run 3: 236.45 (SE +/- 0.05, N = 3)
  Run 2: 235.97 (SE +/- 0.55, N = 3)
  Run 1: 235.54 (SE +/- 0.79, N = 3)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
Cryptsetup Serpent-XTS 256b Decryption OpenBenchmarking.org MiB/s, More Is Better Cryptsetup Serpent-XTS 256b Decryption 3 1 2 120 240 360 480 600 SE +/- 0.81, N = 3 SE +/- 0.78, N = 3 SE +/- 1.18, N = 3 533.5 532.9 531.5
Etcpak 0.7 - Configuration: ETC2 (Mpx/s, More Is Better)
  1: 140.73 (SE +/- 0.03, N = 3)
  2: 140.32 (SE +/- 0.38, N = 3)
  3: 140.21 (SE +/- 0.53, N = 3)
  (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
NCNN 20201218 - Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  1: 5.41 (SE +/- 0.05, N = 3; MIN: 5.29 / MAX: 5.77)
  2: 5.41 (SE +/- 0.02, N = 3; MIN: 5.29 / MAX: 5.54)
  3: 5.43 (SE +/- 0.04, N = 3; MIN: 5.28 / MAX: 6.12)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: vgg16 (ms, Fewer Is Better)
  1: 51.78 (SE +/- 0.13, N = 3; MIN: 51.37 / MAX: 99.9)
  3: 51.85 (SE +/- 0.06, N = 3; MIN: 51.49 / MAX: 53.93)
  2: 51.97 (SE +/- 0.08, N = 3; MIN: 51.54 / MAX: 54.71)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
rav1e 0.4 - Speed: 1 (Frames Per Second, More Is Better)
  2: 0.277 (SE +/- 0.000, N = 3)
  1: 0.277 (SE +/- 0.000, N = 3)
  3: 0.276 (SE +/- 0.001, N = 3)
Etcpak 0.7 - Configuration: DXT1 (Mpx/s, More Is Better)
  3: 1084.33 (SE +/- 2.11, N = 3)
  1: 1083.86 (SE +/- 2.71, N = 3)
  2: 1080.55 (SE +/- 0.37, N = 3)
  (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
Cryptsetup - AES-XTS 512b Decryption (MiB/s, More Is Better)
  3: 1394.3 (SE +/- 2.16, N = 3)
  2: 1393.0 (SE +/- 2.11, N = 3)
  1: 1389.6 (SE +/- 1.87, N = 3)
NCNN 20201218 - Target: CPU - Model: alexnet (ms, Fewer Is Better)
  1: 12.08 (SE +/- 0.02, N = 3; MIN: 11.98 / MAX: 12.56)
  3: 12.11 (SE +/- 0.01, N = 3; MIN: 12.03 / MAX: 12.39)
  2: 12.12 (SE +/- 0.04, N = 3; MIN: 12.02 / MAX: 12.63)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
ASKAP 1.0 - Test: tConvolve MT - Gridding (Million Grid Points Per Second, More Is Better)
  1: 1285.02 (SE +/- 0.36, N = 3)
  2: 1284.71 (SE +/- 0.31, N = 3)
  3: 1280.90 (SE +/- 0.57, N = 3)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
Cryptsetup - Twofish-XTS 512b Decryption (MiB/s, More Is Better)
  2: 346.8 (SE +/- 0.20, N = 3)
  1: 346.4 (SE +/- 0.25, N = 3)
  3: 345.7 (SE +/- 1.02, N = 3)
Cryptsetup - Twofish-XTS 256b Decryption (MiB/s, More Is Better)
  3: 346.9 (SE +/- 0.15, N = 3)
  1: 346.8 (SE +/- 0.24, N = 3)
  2: 345.8 (SE +/- 0.91, N = 3)
ONNX Runtime 1.6 - Model: yolov4 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  2: 328 (SE +/- 0.60, N = 3)
  1: 328 (SE +/- 1.01, N = 3)
  3: 327 (SE +/- 0.44, N = 3)
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
lzbench 1.8 - Test: Zstd 1 - Process: Decompression (MB/s, More Is Better)
  1: 1390 (SE +/- 2.08, N = 3)
  2: 1389 (SE +/- 3.06, N = 3)
  3: 1386 (SE +/- 2.73, N = 3)
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
Cryptsetup - AES-XTS 512b Encryption (MiB/s, More Is Better)
  3: 1404.2 (SE +/- 0.72, N = 3)
  2: 1401.7 (SE +/- 3.43, N = 3)
  1: 1400.2 (SE +/- 1.19, N = 3)
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  2: 4040.59 (SE +/- 1.11, N = 3; MIN: 4035.29)
  3: 4048.11 (SE +/- 6.05, N = 3; MIN: 4033)
  1: 4051.99 (SE +/- 1.90, N = 3; MIN: 4039.27)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
lzbench 1.8 - Test: Brotli 0 - Process: Compression (MB/s, More Is Better)
  3: 357
  2: 356
  1: 356
  Reported SE: +/- 0.33, N = 3; +/- 0.33, N = 3
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  2: 3.35947 (SE +/- 0.00051, N = 3; MIN: 3.3)
  1: 3.36850 (SE +/- 0.01155, N = 3; MIN: 3.3)
  3: 3.36883 (SE +/- 0.01022, N = 3; MIN: 3.27)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  1: 2.83074 (SE +/- 0.00327, N = 3; MIN: 2.8)
  3: 2.83373 (SE +/- 0.00573, N = 3; MIN: 2.79)
  2: 2.83861 (SE +/- 0.00201, N = 3; MIN: 2.8)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
dav1d 0.8.1 - Video Input: Chimera 1080p (FPS, More Is Better)
  2: 459.32 (SE +/- 0.88, N = 3; MIN: 342.72 / MAX: 583.61)
  1: 459.11 (SE +/- 0.72, N = 3; MIN: 341.89 / MAX: 588.95)
  3: 458.06 (SE +/- 2.60, N = 3; MIN: 341.78 / MAX: 587.39)
  (CC) gcc options: -pthread
Mobile Neural Network 1.1.1 - Model: SqueezeNetV1.0 (ms, Fewer Is Better)
  3: 8.424 (SE +/- 0.021, N = 3; MIN: 8.33 / MAX: 9.34)
  2: 8.439 (SE +/- 0.029, N = 3; MIN: 8.3 / MAX: 52.09)
  1: 8.447 (SE +/- 0.011, N = 3; MIN: 8.35 / MAX: 9.56)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
lzbench 1.8 - Test: Zstd 1 - Process: Compression (MB/s, More Is Better)
  2: 401
  1: 401
  3: 400
  Reported SE: +/- 0.58, N = 3; +/- 1.00, N = 3
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
rav1e 0.4 - Speed: 5 (Frames Per Second, More Is Better)
  2: 0.810 (SE +/- 0.005, N = 3)
  1: 0.810 (SE +/- 0.001, N = 3)
  3: 0.808 (SE +/- 0.003, N = 3)
WebP2 Image Encode 20210126 - Encode Settings: Quality 95, Compression Effort 7 (Seconds, Fewer Is Better)
  3: 544.66 (SE +/- 0.96, N = 3)
  2: 545.66 (SE +/- 0.47, N = 3)
  1: 545.96 (SE +/- 1.36, N = 3)
  (CXX) g++ options: -msse4.2 -fno-rtti -O3 -rdynamic -lpthread -ljpeg
OpenFOAM 8 - Input: Motorbike 30M (Seconds, Fewer Is Better)
  1: 219.05 (SE +/- 0.54, N = 3)
  2: 219.10 (SE +/- 0.61, N = 3)
  3: 219.57 (SE +/- 1.02, N = 3)
  (CXX) g++ options: -std=c++11 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -ldynamicMesh -lgenericPatchFields -lOpenFOAM -ldl -lm
lzbench 1.8 - Test: Crush 0 - Process: Decompression (MB/s, More Is Better)
  2: 432
  3: 431
  1: 431
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
Google SynthMark 20201109 - Test: VoiceMark_100 (Voices, More Is Better)
  1: 551.18 (SE +/- 0.84, N = 3)
  2: 550.14 (SE +/- 0.57, N = 3)
  3: 549.94 (SE +/- 0.80, N = 3)
  (CXX) g++ options: -lm -lpthread -std=c++11 -Ofast
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  3: 8.38226 (SE +/- 0.00234, N = 3; MIN: 8.31)
  1: 8.39325 (SE +/- 0.00368, N = 3; MIN: 8.35)
  2: 8.40124 (SE +/- 0.00711, N = 3; MIN: 8.35)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  1: 5.76536 (SE +/- 0.01646, N = 3; MIN: 5.7)
  3: 5.76890 (SE +/- 0.00979, N = 3; MIN: 5.71)
  2: 5.77823 (SE +/- 0.00194, N = 3; MIN: 5.74)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
WebP2 Image Encode 20210126 - Encode Settings: Quality 75, Compression Effort 7 (Seconds, Fewer Is Better)
  2: 298.25 (SE +/- 0.19, N = 3)
  3: 298.54 (SE +/- 0.52, N = 3)
  1: 298.92 (SE +/- 0.36, N = 3)
  (CXX) g++ options: -msse4.2 -fno-rtti -O3 -rdynamic -lpthread -ljpeg
Cryptsetup - Serpent-XTS 512b Decryption (MiB/s, More Is Better)
  1: 533.7
  2: 533.3
  3: 532.6
  Reported SE: +/- 0.52, N = 3; +/- 0.39, N = 3
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  2: 12.69 (SE +/- 0.00, N = 3; MIN: 12.63)
  1: 12.71 (SE +/- 0.00, N = 3; MIN: 12.64)
  3: 12.72 (SE +/- 0.01, N = 3; MIN: 12.64)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Cryptsetup - Serpent-XTS 512b Encryption (MiB/s, More Is Better)
  3: 551.7 (SE +/- 0.26, N = 3)
  2: 551.2 (SE +/- 0.57, N = 3)
  1: 550.7 (SE +/- 0.60, N = 2)
Mobile Neural Network 1.1.1 - Model: inception-v3 (ms, Fewer Is Better)
  1: 55.87 (SE +/- 0.19, N = 3; MIN: 55.38 / MAX: 87.21)
  2: 55.92 (SE +/- 0.21, N = 3; MIN: 55.44 / MAX: 148.38)
  3: 55.96 (SE +/- 0.26, N = 3; MIN: 55.41 / MAX: 122.63)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Etcpak 0.7 - Configuration: ETC1 + Dithering (Mpx/s, More Is Better)
  2: 227.14 (SE +/- 0.03, N = 3)
  3: 227.06 (SE +/- 0.05, N = 3)
  1: 226.75 (SE +/- 0.40, N = 3)
  (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
ASKAP 1.0 - Test: Hogbom Clean OpenMP (Iterations Per Second, More Is Better)
  3: 239.62 (SE +/- 0.19, N = 3)
  2: 239.43 (SE +/- 0.19, N = 3)
  1: 239.24 (SE +/- 0.33, N = 3)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
Timed Godot Game Engine Compilation 3.2.3 - Time To Compile (Seconds, Fewer Is Better)
  3: 186.56 (SE +/- 0.38, N = 3)
  1: 186.70 (SE +/- 0.10, N = 3)
  2: 186.84 (SE +/- 0.26, N = 3)
NCNN 20201218 - Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better)
  1: 26.55 (SE +/- 0.04, N = 3; MIN: 26.01 / MAX: 27.38)
  2: 26.57 (SE +/- 0.02, N = 3; MIN: 26.04 / MAX: 27.24)
  3: 26.59 (SE +/- 0.04, N = 3; MIN: 26.05 / MAX: 27.21)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
WebP2 Image Encode 20210126 - Encode Settings: Quality 100, Lossless Compression (Seconds, Fewer Is Better)
  2: 938.92 (SE +/- 0.53, N = 3)
  1: 939.76 (SE +/- 0.20, N = 3)
  3: 940.23 (SE +/- 0.05, N = 3)
  (CXX) g++ options: -msse4.2 -fno-rtti -O3 -rdynamic -lpthread -ljpeg
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  3: 5.07005 (SE +/- 0.00135, N = 3; MIN: 5.04)
  1: 5.07564 (SE +/- 0.00123, N = 3; MIN: 5.05)
  2: 5.07714 (SE +/- 0.00166, N = 3; MIN: 5.06)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  3: 13.78 (SE +/- 0.00, N = 3; MIN: 13.71)
  1: 13.79 (SE +/- 0.01, N = 3; MIN: 13.69)
  2: 13.80 (SE +/- 0.00, N = 3; MIN: 13.72)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  2: 4.07451 (SE +/- 0.01870, N = 3; MIN: 3.98)
  1: 4.07476 (SE +/- 0.00717, N = 3; MIN: 4)
  3: 4.07980 (SE +/- 0.02757, N = 3; MIN: 3.99)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  3: 4040.61 (SE +/- 3.06, N = 3; MIN: 4031.1)
  2: 4044.15 (SE +/- 1.68, N = 3; MIN: 4035.82)
  1: 4045.79 (SE +/- 2.55, N = 3; MIN: 4037.43)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
QuantLib 1.21 (MFLOPS, More Is Better)
  3: 1691.6 (SE +/- 4.94, N = 3)
  2: 1691.5 (SE +/- 2.43, N = 3)
  1: 1689.6 (SE +/- 3.77, N = 3)
  (CXX) g++ options: -O3 -march=native -rdynamic
lzbench 1.8 - Test: Libdeflate 1 - Process: Decompression (MB/s, More Is Better)
  3: 995
  1: 995
  2: 994
  Reported SE: +/- 0.33, N = 3
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
Cryptsetup - PBKDF2-sha512 (Iterations Per Second, More Is Better)
  2: 1330118
  3: 1328993
  1: 1328993
  Reported SE: +/- 562.33, N = 3
ONNX Runtime 1.6 - Model: super-resolution-10 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  2: 4008 (SE +/- 5.93, N = 3)
  1: 4007 (SE +/- 9.80, N = 3)
  3: 4005 (SE +/- 8.12, N = 3)
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  1: 4048.95 (SE +/- 5.71, N = 3; MIN: 4036)
  2: 4050.54 (SE +/- 6.88, N = 3; MIN: 4036.39)
  3: 4051.42 (SE +/- 5.47, N = 3; MIN: 4040.91)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
WebP2 Image Encode 20210126 - Encode Settings: Quality 100, Compression Effort 5 (Seconds, Fewer Is Better)
  3: 14.48 (SE +/- 0.03, N = 3)
  2: 14.48 (SE +/- 0.02, N = 3)
  1: 14.48 (SE +/- 0.02, N = 3)
  (CXX) g++ options: -msse4.2 -fno-rtti -O3 -rdynamic -lpthread -ljpeg
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  3: 2210.61 (SE +/- 2.02, N = 3; MIN: 2204.44)
  2: 2211.32 (SE +/- 0.63, N = 3; MIN: 2206.12)
  1: 2211.56 (SE +/- 5.90, N = 3; MIN: 2201.95)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Cryptsetup - Twofish-XTS 512b Encryption (MiB/s, More Is Better)
  2: 345.7
  3: 345.6
  1: 345.6
  Reported SE: +/- 0.17, N = 3; +/- 0.10, N = 3
ONNX Runtime 1.6 - Model: fcn-resnet101-11 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  3: 59
  2: 59
  1: 59
  Reported SE: +/- 0.17, N = 3; +/- 0.00, N = 3
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
lzbench 1.8 - Test: Libdeflate 1 - Process: Compression (MB/s, More Is Better)
  3: 182
  2: 182
  1: 182
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Brotli 2 - Process: Compression (MB/s, More Is Better)
  3: 148
  2: 148
  1: 148
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Crush 0 - Process: Compression (MB/s, More Is Better)
  3: 76
  2: 76
  1: 76
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: Zstd 8 - Process: Compression (MB/s, More Is Better)
  3: 65
  2: 65
  1: 65
  Reported SE: +/- 0.58, N = 3
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: XZ 0 - Process: Decompression (MB/s, More Is Better)
  3: 96
  2: 96
  1: 96
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 1.8 - Test: XZ 0 - Process: Compression (MB/s, More Is Better)
  3: 34
  2: 34
  1: 34
  Reported SE: +/- 0.33, N = 3
  (CXX) g++ options: -pthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
Redis 6.0.9 - Test: LPOP (Requests Per Second, More Is Better)
  1: 2010121.55 (SE +/- 6269.54, N = 3)
  3: 1758555.23 (SE +/- 91181.44, N = 12)
  2: 1271416.38 (SE +/- 4394.55, N = 3)
  (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
ASKAP 1.0 - Test: tConvolve OpenMP - Gridding (Million Grid Points Per Second, More Is Better)
  3: 1962.74 (SE +/- 12.81, N = 3)
  2: 1881.94 (SE +/- 31.45, N = 15)
  1: 1778.10 (SE +/- 39.18, N = 15)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP 1.0 - Test: tConvolve MPI - Gridding (Mpix/sec, More Is Better)
  1: 2071.46 (SE +/- 65.46, N = 3)
  2: 1984.73 (SE +/- 45.52, N = 15)
  3: 1979.00 (SE +/- 47.93, N = 12)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP 1.0 - Test: tConvolve MPI - Degridding (Mpix/sec, More Is Better)
  1: 1699.85 (SE +/- 18.20, N = 3)
  3: 1597.84 (SE +/- 29.00, N = 12)
  2: 1589.75 (SE +/- 28.57, N = 15)
  (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
Cpuminer-Opt 3.15.5 - Algorithm: Triple SHA-256, Onecoin (kH/s, More Is Better)
  3: 62923 (SE +/- 80.07, N = 3)
  1: 61247 (SE +/- 902.62, N = 15)
  2: 60350 (SE +/- 1076.82, N = 15)
  (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.15.5 - Algorithm: Quad SHA-256, Pyrite (kH/s, More Is Better)
  2: 48622 (SE +/- 748.75, N = 15)
  3: 48341 (SE +/- 1271.64, N = 15)
  1: 47653 (SE +/- 749.50, N = 3)
  (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.15.5 - Algorithm: Deepcoin (kH/s, More Is Better)
  1: 7487.98 (SE +/- 132.89, N = 15)
  3: 7312.24 (SE +/- 33.20, N = 3)
  2: 7013.87 (SE +/- 164.38, N = 15)
  (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
LAMMPS Molecular Dynamics Simulator 29Oct2020 - Model: Rhodopsin Protein (ns/day, More Is Better)
  3: 4.463 (SE +/- 0.121, N = 15)
  1: 4.415 (SE +/- 0.131, N = 15)
  2: 4.261 (SE +/- 0.017, N = 3)
  (CXX) g++ options: -O3 -pthread -lm
CLOMP 1.2 - Static OMP Speedup (Speedup, More Is Better)
  1: 13.2 (SE +/- 0.13, N = 3)
  3: 13.0 (SE +/- 0.09, N = 3)
  2: 11.0 (SE +/- 1.35, N = 12)
  (CC) gcc options: -fopenmp -O3 -lm
Phoronix Test Suite v10.8.5