xeon auggy
2 x Intel Xeon Platinum 8380 testing with an Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS) and ASPEED on Ubuntu 22.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2308067-NE-XEONAUGGY65&grw .
xeon auggy system details
Runs: a, b
Processor: 2 x Intel Xeon Platinum 8380 @ 3.40GHz (80 Cores / 160 Threads)
Motherboard: Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS)
Chipset: Intel Ice Lake IEH
Memory: 512GB
Disk: 7682GB INTEL SSDPF2KX076TZ
Graphics: ASPEED
Monitor: VE228
Network: 2 x Intel X710 for 10GBASE-T + 2 x Intel E810-C for QSFP
OS: Ubuntu 22.10
Kernel: 6.2.0-rc5-phx-dodt (x86_64)
Desktop: GNOME Shell 43.0
Display Server: X Server 1.21.1.3
Vulkan: 1.3.224
Compiler: GCC 12.2.0
File-System: ext4
Screen Resolution: 1920x1080
Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details - Scaling Governor: intel_pstate performance (EPP: performance) - CPU Microcode: 0xd000389
Java Details - OpenJDK Runtime Environment (build 11.0.19+7-post-Ubuntu-0ubuntu122.10.1)
Python Details - Python 3.10.7
Security Details - dodt: Mitigation of DOITM + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
xeon auggy heffte: r2c - FFTW - double - 256 heffte: c2c - Stock - float - 256 heffte: r2c - FFTW - double - 128 heffte: c2c - FFTW - double - 256 heffte: c2c - Stock - float - 128 heffte: r2c - FFTW - float - 256 heffte: c2c - FFTW - double - 128 heffte: c2c - FFTW - float - 256 heffte: r2c - FFTW - float - 128 heffte: c2c - FFTW - float - 128 libxsmm: 64 libxsmm: 32 heffte: r2c - Stock - float - 128 heffte: r2c - Stock - float - 256 stress-ng: Pipe stress-ng: Zlib stress-ng: Cloning stress-ng: Pthread stress-ng: AVL Tree stress-ng: Floating Point stress-ng: Matrix 3D Math stress-ng: Vector Shuffle stress-ng: Fused Multiply-Add remhos: Sample Remap Example heffte: r2c - Stock - double - 256 heffte: r2c - Stock - double - 128 heffte: c2c - Stock - double - 256 heffte: c2c - Stock - double - 128 stress-ng: Wide Vector Math stress-ng: Vector Floating Point encode-opus: WAV To Opus Encode quantlib: ncnn: CPU - mobilenet ncnn: CPU-v2-v2 - mobilenet-v2 ncnn: CPU-v3-v3 - mobilenet-v3 ncnn: CPU - shufflenet-v2 ncnn: CPU - mnasnet ncnn: CPU - efficientnet-b0 ncnn: CPU - blazeface ncnn: CPU - googlenet ncnn: CPU - vgg16 ncnn: CPU - resnet18 ncnn: CPU - alexnet ncnn: CPU - resnet50 ncnn: CPU - yolov4-tiny ncnn: CPU - squeezenet_ssd ncnn: CPU - regnety_400m ncnn: CPU - vision_transformer ncnn: CPU - FastestDet gpaw: Carbon Nanotube build-gcc: Time To Compile dav1d: Chimera 1080p dav1d: Summer Nature 4K dav1d: Summer Nature 1080p dav1d: Chimera 1080p 10-bit blender: BMW27 - CPU-Only blender: Classroom - CPU-Only blender: Fishy Cat - CPU-Only blender: Barbershop - CPU-Only vvenc: Bosphorus 4K - Fast vvenc: Bosphorus 1080p - Fast z3: 2.smt2 heffte: r2c - Stock - double - 512 z3: 1.smt2 heffte: r2c - Stock - float - 512 heffte: c2c - Stock - float - 512 heffte: r2c - FFTW - double - 512 heffte: c2c - FFTW - double - 512 heffte: r2c - FFTW - float - 512 heffte: c2c - FFTW - float - 512 vvenc: Bosphorus 4K - Faster vvenc: Bosphorus 1080p - Faster embree: Pathtracer - Crown embree: 
Pathtracer ISPC - Crown libxsmm: 256 heffte: c2c - Stock - double - 512 embree: Pathtracer - Asian Dragon embree: Pathtracer - Asian Dragon Obj embree: Pathtracer ISPC - Asian Dragon embree: Pathtracer ISPC - Asian Dragon Obj oidn: RT.hdr_alb_nrm.3840x2160 - CPU-Only oidn: RT.ldr_alb_nrm.3840x2160 - CPU-Only oidn: RTLightmap.hdr.4096x4096 - CPU-Only ospray: particle_volume/ao/real_time ospray: particle_volume/scivis/real_time ospray: particle_volume/pathtracer/real_time ospray: gravity_spheres_volume/dim_512/ao/real_time ospray: gravity_spheres_volume/dim_512/scivis/real_time ospray: gravity_spheres_volume/dim_512/pathtracer/real_time liquid-dsp: 32 - 256 - 32 liquid-dsp: 32 - 256 - 57 liquid-dsp: 64 - 256 - 32 liquid-dsp: 64 - 256 - 57 liquid-dsp: 128 - 256 - 32 liquid-dsp: 128 - 256 - 57 liquid-dsp: 160 - 256 - 32 liquid-dsp: 160 - 256 - 57 liquid-dsp: 32 - 256 - 512 liquid-dsp: 64 - 256 - 512 liquid-dsp: 128 - 256 - 512 liquid-dsp: 160 - 256 - 512 liquid-dsp: 1 - 256 - 32 liquid-dsp: 1 - 256 - 57 liquid-dsp: 1 - 256 - 512 liquid-dsp: 16 - 256 - 32 liquid-dsp: 16 - 256 - 57 liquid-dsp: 16 - 256 - 512 srsran: Downlink Processor Benchmark srsran: PUSCH Processor Benchmark, Throughput Total srsran: PUSCH Processor Benchmark, Throughput Thread couchdb: 100 - 1000 - 30 couchdb: 300 - 1000 - 30 couchdb: 500 - 1000 - 30 apache-iotdb: 100 - 1 - 200 apache-iotdb: 100 - 1 - 200 apache-iotdb: 100 - 1 - 500 apache-iotdb: 100 - 1 - 500 apache-iotdb: 200 - 1 - 200 apache-iotdb: 200 - 1 - 200 apache-iotdb: 200 - 1 - 500 apache-iotdb: 200 - 1 - 500 apache-iotdb: 500 - 1 - 200 apache-iotdb: 500 - 1 - 200 apache-iotdb: 500 - 1 - 500 apache-iotdb: 500 - 1 - 500 apache-iotdb: 100 - 100 - 200 apache-iotdb: 100 - 100 - 200 apache-iotdb: 100 - 100 - 500 apache-iotdb: 100 - 100 - 500 libxsmm: 128 dragonflydb: 60 - 1:100 a b 93.0098 101.8 156.224 45.8509 107.452 222.215 94.4544 102.278 199.103 159.344 1219.9 633.2 185.453 236.666 40500166.81 6879.86 16195.03 92131.54 610.69 21134.81 
12743.81 48054.48 181083180.47 12.195 101.938 117.006 46.6636 69.4816 2195391.41 132479.08 36.736 2622.9 16.06 7.91 8.71 9.82 7.61 11.48 4.49 17.06 26.27 10.31 5.65 17.57 24.77 15.78 38.20 46.50 9.62 45.824 957.946 516.17 282.53 699.97 476.82 23.69 62.35 30.59 239.55 5.672 15.708 87.998 94.2637 25.713 176.630 93.3349 90.5745 49.4363 170.906 94.8348 10.284 29.077 72.0419 87.9306 599.8 47.2801 85.2423 76.9608 104.4148 89.8447 3.04 3.05 1.47 24.637 24.3592 151.138 21.2056 20.811 22.6977 992540000 1197700000 1805000000 2069200000 2961100000 2519200000 3390700000 2602300000 400730000 725840000 949400000 1013200000 32338000 53918500 13323000 493660000 615105000 201615000 556.5 9800.5 164.8 94.834 152.456 1090.424 638644.35 17.54 995259.68 35.99 904320.6 14.84 1134736.54 36.74 1199743.22 13.25 1343156.56 33.38 34266143.85 42.79 39562245.22 109.4 1055.3 92.8161 104.345 148.177 46.4607 107.952 230.541 91.9039 98.8315 199.333 158.711 1216.0 639.3 182.03 236.119 49325396.97 6880.22 13172.81 90361.7 610.83 21133.02 12742.70 48076.78 181314757.42 12.365 102.426 116.839 46.5052 67.9483 2196242.21 131100.25 36.726 2607.6 15.90 7.94 8.77 9.75 7.43 11.64 4.61 16.72 26.19 9.63 5.44 18.15 24.48 16.13 39.37 45.56 10.01 45.636 956.127 516.50 282.65 699.09 476.77 23.62 62.51 30.77 239.03 5.717 15.723 87.178 92.7173 25.251 174.114 92.2568 90.2388 48.7210 171.134 94.3442 10.430 29.176 70.7905 87.9319 592.5 47.3856 85.1315 77.2633 104.5539 90.0132 3.01 3.03 1.46 24.6207 24.7827 151.273 20.9459 20.4818 22.5752 993445000 1185500000 1825450000 2076650000 2945150000 2426350000 3381950000 2636450000 396265000 730310000 945190000 1011200000 32267000 53926500 13291000 498410000 623535000 198790000 556.8 9756.7 164.7 628202.55 18.04 992909.69 36.16 920435.77 14.42 1137612.61 36.82 1141859.25 14.12 1372429.58 32.75 34807016.85 42.19 43021501.4 96.18 1946.7 OpenBenchmarking.org
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256 (GFLOP/s, More Is Better): a = 93.01, b = 92.82; SE +/- 1.52, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: Stock - Precision: float - X Y Z: 256 (GFLOP/s, More Is Better): a = 101.80, b = 104.35; SE +/- 0.34, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128 (GFLOP/s, More Is Better): a = 156.22, b = 148.18; SE +/- 2.45, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256 (GFLOP/s, More Is Better): a = 45.85, b = 46.46; SE +/- 0.55, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: Stock - Precision: float - X Y Z: 128 (GFLOP/s, More Is Better): a = 107.45, b = 107.95; SE +/- 1.54, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256 (GFLOP/s, More Is Better): a = 222.22, b = 230.54; SE +/- 3.31, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128 (GFLOP/s, More Is Better): a = 94.45, b = 91.90; SE +/- 1.46, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256 (GFLOP/s, More Is Better): a = 102.28, b = 98.83; SE +/- 0.00, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128 (GFLOP/s, More Is Better): a = 199.10, b = 199.33; SE +/- 0.39, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128 (GFLOP/s, More Is Better): a = 159.34, b = 158.71; SE +/- 0.21, N = 2. (CXX) g++ options: -O3
libxsmm 2-1.17-3645, M N K: 64 (GFLOPS/s, More Is Better): a = 1219.9, b = 1216.0; SE +/- 1.25, N = 2. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
libxsmm 2-1.17-3645, M N K: 32 (GFLOPS/s, More Is Better): a = 633.2, b = 639.3; SE +/- 2.35, N = 2. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: Stock - Precision: float - X Y Z: 128 (GFLOP/s, More Is Better): a = 185.45, b = 182.03; SE +/- 0.86, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: Stock - Precision: float - X Y Z: 256 (GFLOP/s, More Is Better): a = 236.67, b = 236.12; SE +/- 2.75, N = 2. (CXX) g++ options: -O3
Stress-NG 0.15.10, Test: Pipe (Bogo Ops/s, More Is Better): a = 40500166.81, b = 49325396.97; SE +/- 2523742.74, N = 2; SE +/- 6369572.71, N = 2. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.15.10, Test: Zlib (Bogo Ops/s, More Is Better): a = 6879.86, b = 6880.22; SE +/- 8.83, N = 2; SE +/- 4.39, N = 2. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.15.10, Test: Cloning (Bogo Ops/s, More Is Better): a = 16195.03, b = 13172.81; SE +/- 3270.70, N = 2; SE +/- 654.66, N = 2. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.15.10, Test: Pthread (Bogo Ops/s, More Is Better): a = 92131.54, b = 90361.70; SE +/- 279.15, N = 2; SE +/- 894.90, N = 2. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.15.10, Test: AVL Tree (Bogo Ops/s, More Is Better): a = 610.69, b = 610.83; SE +/- 0.08, N = 2; SE +/- 0.44, N = 2. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.15.10, Test: Floating Point (Bogo Ops/s, More Is Better): a = 21134.81, b = 21133.02; SE +/- 6.86, N = 2; SE +/- 9.14, N = 2. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.15.10, Test: Matrix 3D Math (Bogo Ops/s, More Is Better): a = 12743.81, b = 12742.70; SE +/- 9.80, N = 2; SE +/- 5.06, N = 2. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.15.10, Test: Vector Shuffle (Bogo Ops/s, More Is Better): a = 48054.48, b = 48076.78; SE +/- 1.26, N = 2; SE +/- 42.22, N = 2. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.15.10, Test: Fused Multiply-Add (Bogo Ops/s, More Is Better): a = 181083180.47, b = 181314757.42; SE +/- 118686.48, N = 2; SE +/- 92010.25, N = 2. (CXX) g++ options: -O2 -std=gnu99 -lc
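The Stress-NG Pipe result shows the widest gap between runs a and b in this group, but it also reports the largest standard errors. The following Python is a rough sketch, not part of the exported result, for putting such a gap in context: it computes the percent change of b relative to a and expresses the gap in units of the two reported standard errors combined in quadrature. The function names are hypothetical, and combining the errors in quadrature assumes the two runs are independent.

```python
import math


def percent_change(a: float, b: float) -> float:
    """Percent change of run b relative to run a."""
    return (b - a) / a * 100.0


def gap_in_combined_se(a: float, b: float, se_1: float, se_2: float) -> float:
    """Absolute gap between a and b, in units of the combined standard error.

    The two standard errors are combined in quadrature, which assumes the
    runs are independent (an assumption, not something the export states).
    The result does not depend on which SE belongs to which run.
    """
    combined = math.hypot(se_1, se_2)
    return abs(b - a) / combined


if __name__ == "__main__":
    # Values copied from the Stress-NG "Pipe" result (Bogo Ops/s).
    pipe_a, pipe_b = 40500166.81, 49325396.97
    pipe_se = (2523742.74, 6369572.71)

    print(f"percent change b vs a: {percent_change(pipe_a, pipe_b):.2f}%")
    print(f"gap / combined SE:     {gap_in_combined_se(pipe_a, pipe_b, *pipe_se):.2f}")
```

With only N = 2 samples per run, a gap of little more than one combined standard error is weak evidence of a real difference, so the Pipe delta is better read as run-to-run noise than as a regression or improvement.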
Remhos 1.0, Test: Sample Remap Example (Seconds, Fewer Is Better): a = 12.20, b = 12.37; SE +/- 0.10, N = 2. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: Stock - Precision: double - X Y Z: 256 (GFLOP/s, More Is Better): a = 101.94, b = 102.43; SE +/- 0.72, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: Stock - Precision: double - X Y Z: 128 (GFLOP/s, More Is Better): a = 117.01, b = 116.84; SE +/- 4.04, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: Stock - Precision: double - X Y Z: 256 (GFLOP/s, More Is Better): a = 46.66, b = 46.51; SE +/- 0.08, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: Stock - Precision: double - X Y Z: 128 (GFLOP/s, More Is Better): a = 69.48, b = 67.95; SE +/- 1.01, N = 2. (CXX) g++ options: -O3
Stress-NG 0.15.10, Test: Wide Vector Math (Bogo Ops/s, More Is Better): a = 2195391.41, b = 2196242.21; SE +/- 1200.16, N = 2; SE +/- 497.69, N = 2. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.15.10, Test: Vector Floating Point (Bogo Ops/s, More Is Better): a = 132479.08, b = 131100.25; SE +/- 872.22, N = 2; SE +/- 235.00, N = 2. (CXX) g++ options: -O2 -std=gnu99 -lc
Opus Codec Encoding 1.4, WAV To Opus Encode (Seconds, Fewer Is Better): a = 36.74, b = 36.73; SE +/- 0.01, N = 2. (CXX) g++ options: -O3 -fvisibility=hidden -logg -lm
QuantLib 1.30 (MFLOPS, More Is Better): a = 2622.9, b = 2607.6; SE +/- 2.00, N = 2; SE +/- 1.80, N = 2. (CXX) g++ options: -O3 -march=native -fPIE -pie
NCNN 20230517, Target: CPU - Model: mobilenet (ms, Fewer Is Better): a = 16.06, b = 15.90; SE +/- 0.82, N = 2; SE +/- 0.28, N = 2; MIN: 14.92 / MAX: 25.43; MIN: 15.2 / MAX: 39.56. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better): a = 7.91, b = 7.94; SE +/- 0.13, N = 2; SE +/- 0.02, N = 2; MIN: 7.68 / MAX: 9.6; MIN: 7.81 / MAX: 10.99. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better): a = 8.71, b = 8.77; SE +/- 0.14, N = 2; SE +/- 0.03, N = 2; MIN: 8.43 / MAX: 9.8; MIN: 8.59 / MAX: 32.78. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better): a = 9.82, b = 9.75; SE +/- 0.03, N = 2; SE +/- 0.06, N = 2; MIN: 9.6 / MAX: 12.61; MIN: 9.56 / MAX: 13.68. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: mnasnet (ms, Fewer Is Better): a = 7.61, b = 7.43; SE +/- 0.07, N = 2; SE +/- 0.01, N = 2; MIN: 7.33 / MAX: 43.29; MIN: 7.16 / MAX: 15.37. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better): a = 11.48, b = 11.64; SE +/- 0.23, N = 2; SE +/- 0.26, N = 2; MIN: 10.9 / MAX: 56.34; MIN: 10.85 / MAX: 37.54. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: blazeface (ms, Fewer Is Better): a = 4.49, b = 4.61; SE +/- 0.09, N = 2; SE +/- 0.01, N = 2; MIN: 4.31 / MAX: 5.13; MIN: 4.49 / MAX: 5.33. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: googlenet (ms, Fewer Is Better): a = 17.06, b = 16.72; SE +/- 1.05, N = 2; SE +/- 0.37, N = 2; MIN: 15.5 / MAX: 66.12; MIN: 15.67 / MAX: 100.92. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: vgg16 (ms, Fewer Is Better): a = 26.27, b = 26.19; SE +/- 0.84, N = 2; SE +/- 0.34, N = 2; MIN: 24.05 / MAX: 301.35; MIN: 24.19 / MAX: 341.6. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: resnet18 (ms, Fewer Is Better): a = 10.31, b = 9.63; SE +/- 1.04, N = 2; SE +/- 0.31, N = 2; MIN: 9.03 / MAX: 33.3; MIN: 9.16 / MAX: 26.03. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: alexnet (ms, Fewer Is Better): a = 5.65, b = 5.44; SE +/- 0.43, N = 2; SE +/- 0.22, N = 2; MIN: 5.03 / MAX: 6.71; MIN: 5.08 / MAX: 7.52. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: resnet50 (ms, Fewer Is Better): a = 17.57, b = 18.15; SE +/- 0.36, N = 2; SE +/- 0.83, N = 2; MIN: 16.92 / MAX: 18.88; MIN: 16.98 / MAX: 42.81. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better): a = 24.77, b = 24.48; SE +/- 0.65, N = 2; SE +/- 0.64, N = 2; MIN: 22.68 / MAX: 208.18; MIN: 22.66 / MAX: 47.54. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better): a = 15.78, b = 16.13; SE +/- 0.07, N = 2; SE +/- 0.41, N = 2; MIN: 15.4 / MAX: 43.08; MIN: 15.35 / MAX: 39.36. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: regnety_400m (ms, Fewer Is Better): a = 38.20, b = 39.37; SE +/- 0.86, N = 2; SE +/- 1.10, N = 2; MIN: 36.18 / MAX: 62.76; MIN: 37.07 / MAX: 103.97. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: vision_transformer (ms, Fewer Is Better): a = 46.50, b = 45.56; SE +/- 2.46, N = 2; SE +/- 1.19, N = 2; MIN: 42.6 / MAX: 72.28; MIN: 43.24 / MAX: 70.59. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: FastestDet (ms, Fewer Is Better): a = 9.62, b = 10.01; SE +/- 0.05, N = 2; SE +/- 0.28, N = 2; MIN: 9.35 / MAX: 10.52; MIN: 9.4 / MAX: 59.43. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
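The NCNN results are latencies in ms (Fewer Is Better), whereas most other results in this file are throughput figures (More Is Better), so a plain b / a ratio reads in opposite directions for the two kinds of test. A minimal Python sketch of a direction-aware comparison follows; the helper name `speedup_b_over_a` is hypothetical and not part of the Phoronix Test Suite.

```python
def speedup_b_over_a(a: float, b: float, fewer_is_better: bool) -> float:
    """Return how much better run b is than run a as a ratio.

    A value above 1.0 means b did better, below 1.0 means a did better,
    regardless of whether the metric is a time/latency or a throughput.
    """
    if a <= 0 or b <= 0:
        raise ValueError("results must be positive")
    # For time-like metrics a smaller number is better, so invert the ratio.
    return a / b if fewer_is_better else b / a


if __name__ == "__main__":
    # NCNN resnet18 and FastestDet latencies in ms, copied from the results.
    print(f"resnet18:   {speedup_b_over_a(10.31, 9.63, fewer_is_better=True):.3f}")
    print(f"FastestDet: {speedup_b_over_a(9.62, 10.01, fewer_is_better=True):.3f}")
```

Normalizing every result this way makes it possible to compare a and b across mixed units before weighing each ratio against the reported standard errors.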
GPAW 23.6, Input: Carbon Nanotube (Seconds, Fewer Is Better): a = 45.82, b = 45.64; SE +/- 0.02, N = 2; SE +/- 0.03, N = 2. (CC) gcc options: -shared -fwrapv -O2 -lxc -lblas -lmpi
Timed GCC Compilation 13.2, Time To Compile (Seconds, Fewer Is Better): a = 957.95, b = 956.13; SE +/- 1.97, N = 2.
dav1d 1.2.1, Video Input: Chimera 1080p (FPS, More Is Better): a = 516.17, b = 516.50; SE +/- 0.06, N = 2. (CC) gcc options: -pthread -lm
dav1d 1.2.1, Video Input: Summer Nature 4K (FPS, More Is Better): a = 282.53, b = 282.65; SE +/- 0.08, N = 2. (CC) gcc options: -pthread -lm
dav1d 1.2.1, Video Input: Summer Nature 1080p (FPS, More Is Better): a = 699.97, b = 699.09; SE +/- 0.50, N = 2. (CC) gcc options: -pthread -lm
dav1d 1.2.1, Video Input: Chimera 1080p 10-bit (FPS, More Is Better): a = 476.82, b = 476.77; SE +/- 0.41, N = 2. (CC) gcc options: -pthread -lm
Blender 3.6, Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better): a = 23.69, b = 23.62; SE +/- 0.09, N = 2; SE +/- 0.06, N = 2.
Blender 3.6, Blend File: Classroom - Compute: CPU-Only (Seconds, Fewer Is Better): a = 62.35, b = 62.51; SE +/- 0.04, N = 2; SE +/- 0.19, N = 2.
Blender 3.6, Blend File: Fishy Cat - Compute: CPU-Only (Seconds, Fewer Is Better): a = 30.59, b = 30.77; SE +/- 0.01, N = 2; SE +/- 0.02, N = 2.
Blender 3.6, Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better): a = 239.55, b = 239.03; SE +/- 0.32, N = 2; SE +/- 1.32, N = 2.
VVenC 1.9, Video Input: Bosphorus 4K - Video Preset: Fast (Frames Per Second, More Is Better): a = 5.672, b = 5.717; SE +/- 0.019, N = 2; SE +/- 0.063, N = 2. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
VVenC 1.9, Video Input: Bosphorus 1080p - Video Preset: Fast (Frames Per Second, More Is Better): a = 15.71, b = 15.72; SE +/- 0.06, N = 2; SE +/- 0.04, N = 2. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
Z3 Theorem Prover 4.12.1, SMT File: 2.smt2 (Seconds, Fewer Is Better): a = 88.00, b = 87.18; SE +/- 0.02, N = 2; SE +/- 0.05, N = 2. (CXX) g++ options: -lpthread -std=c++17 -fvisibility=hidden -mfpmath=sse -msse -msse2 -O3 -fPIC
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: Stock - Precision: double - X Y Z: 512 (GFLOP/s, More Is Better): a = 94.26, b = 92.72; SE +/- 0.13, N = 2; SE +/- 0.73, N = 2. (CXX) g++ options: -O3
Z3 Theorem Prover 4.12.1, SMT File: 1.smt2 (Seconds, Fewer Is Better): a = 25.71, b = 25.25; SE +/- 0.03, N = 2; SE +/- 0.07, N = 2. (CXX) g++ options: -lpthread -std=c++17 -fvisibility=hidden -mfpmath=sse -msse -msse2 -O3 -fPIC
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: Stock - Precision: float - X Y Z: 512 (GFLOP/s, More Is Better): a = 176.63, b = 174.11; SE +/- 0.68, N = 2; SE +/- 0.12, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: Stock - Precision: float - X Y Z: 512 (GFLOP/s, More Is Better): a = 93.33, b = 92.26; SE +/- 0.40, N = 2; SE +/- 0.24, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512 (GFLOP/s, More Is Better): a = 90.57, b = 90.24; SE +/- 1.08, N = 2; SE +/- 1.17, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512 (GFLOP/s, More Is Better): a = 49.44, b = 48.72; SE +/- 0.29, N = 2; SE +/- 0.39, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512 (GFLOP/s, More Is Better): a = 170.91, b = 171.13; SE +/- 1.25, N = 2; SE +/- 1.39, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512 (GFLOP/s, More Is Better): a = 94.83, b = 94.34; SE +/- 1.11, N = 2; SE +/- 0.95, N = 2. (CXX) g++ options: -O3
VVenC 1.9, Video Input: Bosphorus 4K - Video Preset: Faster (Frames Per Second, More Is Better): a = 10.28, b = 10.43; SE +/- 0.03, N = 2; SE +/- 0.10, N = 2. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
VVenC 1.9, Video Input: Bosphorus 1080p - Video Preset: Faster (Frames Per Second, More Is Better): a = 29.08, b = 29.18; SE +/- 0.37, N = 2; SE +/- 0.23, N = 2. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
Embree 4.1, Binary: Pathtracer - Model: Crown (Frames Per Second, More Is Better): a = 72.04, b = 70.79; SE +/- 0.12, N = 2; MIN: 68.2 / MAX: 79.55; MIN: 67 / MAX: 79.71.
Embree 4.1, Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better): a = 87.93, b = 87.93; SE +/- 0.10, N = 2; MIN: 85.27 / MAX: 92.58; MIN: 84.73 / MAX: 92.37.
libxsmm 2-1.17-3645, M N K: 256 (GFLOPS/s, More Is Better): a = 599.8, b = 592.5; SE +/- 2.65, N = 2. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: Stock - Precision: double - X Y Z: 512 (GFLOP/s, More Is Better): a = 47.28, b = 47.39; SE +/- 0.14, N = 2; SE +/- 0.07, N = 2. (CXX) g++ options: -O3
Embree 4.1, Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, More Is Better): a = 85.24, b = 85.13; SE +/- 0.04, N = 2; SE +/- 0.14, N = 2; MIN: 83.75 / MAX: 89.99; MIN: 83.65 / MAX: 90.45.
Embree 4.1, Binary: Pathtracer - Model: Asian Dragon Obj (Frames Per Second, More Is Better): a = 76.96, b = 77.26; SE +/- 0.03, N = 2; SE +/- 0.03, N = 2; MIN: 75.53 / MAX: 82.14; MIN: 75.78 / MAX: 81.08.
Embree 4.1, Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better): a = 104.41, b = 104.55; SE +/- 0.32, N = 2; SE +/- 0.24, N = 2; MIN: 101.88 / MAX: 109.22; MIN: 102.2 / MAX: 108.91.
Embree 4.1, Binary: Pathtracer ISPC - Model: Asian Dragon Obj (Frames Per Second, More Is Better): a = 89.84, b = 90.01; SE +/- 0.06, N = 2; SE +/- 0.05, N = 2; MIN: 87.68 / MAX: 94.71; MIN: 87.6 / MAX: 94.43.
Intel Open Image Denoise 2.0, Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only (Images / Sec, More Is Better): a = 3.04, b = 3.01; SE +/- 0.00, N = 2.
Intel Open Image Denoise 2.0, Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only (Images / Sec, More Is Better): a = 3.05, b = 3.03; SE +/- 0.01, N = 2.
Intel Open Image Denoise 2.0, Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only (Images / Sec, More Is Better): a = 1.47, b = 1.46; SE +/- 0.00, N = 2.
OSPRay 2.12 - Benchmark: particle_volume/ao/real_time (Items Per Second, More Is Better): a = 24.64; b = 24.62; SE +/- 0.09, N = 2
OSPRay 2.12 - Benchmark: particle_volume/scivis/real_time (Items Per Second, More Is Better): a = 24.36; b = 24.78; SE +/- 0.03, N = 2
OSPRay 2.12 - Benchmark: particle_volume/pathtracer/real_time (Items Per Second, More Is Better): a = 151.14; b = 151.27; SE +/- 0.63, N = 2
OSPRay 2.12 - Benchmark: gravity_spheres_volume/dim_512/ao/real_time (Items Per Second, More Is Better): a = 21.21; b = 20.95; SE +/- 0.22, N = 2
OSPRay 2.12 - Benchmark: gravity_spheres_volume/dim_512/scivis/real_time (Items Per Second, More Is Better): a = 20.81; b = 20.48; SE +/- 0.05, N = 2
OSPRay 2.12 - Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time (Items Per Second, More Is Better): a = 22.70; b = 22.58; SE +/- 0.00, N = 2
Liquid-DSP 1.6 - Threads: 32 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): a = 992540000; b = 993445000; SE +/- 2085000.00, N = 2. 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6 - Threads: 32 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): a = 1197700000; b = 1185500000; SE +/- 20600000.00, N = 2. 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6 - Threads: 64 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): a = 1805000000; b = 1825450000; SE +/- 2750000.00, N = 2. 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6 - Threads: 64 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): a = 2069200000; b = 2076650000; SE +/- 13650000.00, N = 2. 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6 - Threads: 128 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): a = 2961100000; b = 2945150000; SE +/- 6350000.00, N = 2. 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6 - Threads: 128 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): a = 2519200000; b = 2426350000; SE +/- 13350000.00, N = 2. 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6 - Threads: 160 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): a = 3390700000; b = 3381950000; SE +/- 9150000.00, N = 2. 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6 - Threads: 160 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): a = 2602300000; b = 2636450000; SE +/- 17250000.00, N = 2. 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6 - Threads: 32 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): a = 400730000; b = 396265000; SE +/- 2775000.00, N = 2. 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6 - Threads: 64 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): a = 725840000; b = 730310000; SE +/- 1250000.00, N = 2. 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6 - Threads: 128 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): a = 949400000; b = 945190000; SE +/- 2180000.00, N = 2. 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6 - Threads: 160 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): a = 1013200000; b = 1011200000; SE +/- 1800000.00, N = 2. 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6 - Threads: 1 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): a = 32338000 (SE +/- 0.00, N = 2); b = 32267000 (SE +/- 0.00, N = 2). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6 - Threads: 1 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): a = 53918500 (SE +/- 500.00, N = 2); b = 53926500 (SE +/- 1500.00, N = 2). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6 - Threads: 1 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): a = 13323000 (SE +/- 1000.00, N = 2); b = 13291000 (SE +/- 34000.00, N = 2). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6 - Threads: 16 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): a = 493660000 (SE +/- 1500000.00, N = 2); b = 498410000 (SE +/- 830000.00, N = 2). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6 - Threads: 16 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): a = 615105000 (SE +/- 3155000.00, N = 2); b = 623535000 (SE +/- 11305000.00, N = 2). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6 - Threads: 16 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): a = 201615000 (SE +/- 795000.00, N = 2); b = 198790000 (SE +/- 650000.00, N = 2). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
srsRAN Project 23.5 - Test: Downlink Processor Benchmark (Mbps, More Is Better): a = 556.5 (SE +/- 0.70, N = 2); b = 556.8 (SE +/- 1.25, N = 2). 1. (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest
srsRAN Project 23.5 - Test: PUSCH Processor Benchmark, Throughput Total (Mbps, More Is Better): a = 9800.5 (SE +/- 47.35, N = 2); b = 9756.7 (SE +/- 33.95, N = 2). 1. (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest
srsRAN Project 23.5 - Test: PUSCH Processor Benchmark, Throughput Thread (Mbps, More Is Better): a = 164.8 (SE +/- 1.70, N = 2); b = 164.7 (SE +/- 0.90, N = 2). 1. (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest
Apache CouchDB 3.3.2 - Bulk Size: 100 - Inserts: 1000 - Rounds: 30 (Seconds, Fewer Is Better): a = 94.83. 1. (CXX) g++ options: -std=c++17 -lmozjs-78 -lm -lei -fPIC -MMD
Apache CouchDB 3.3.2 - Bulk Size: 300 - Inserts: 1000 - Rounds: 30 (Seconds, Fewer Is Better): a = 152.46. 1. (CXX) g++ options: -std=c++17 -lmozjs-78 -lm -lei -fPIC -MMD
Apache CouchDB 3.3.2 - Bulk Size: 500 - Inserts: 1000 - Rounds: 30 (Seconds, Fewer Is Better): a = 1090.42. 1. (CXX) g++ options: -std=c++17 -lmozjs-78 -lm -lei -fPIC -MMD
Apache IoTDB 1.1.2 - Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 200 (point/sec, More Is Better): b = 628202.55; a = 638644.35
Apache IoTDB 1.1.2 - Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 200 (Average Latency, Fewer Is Better): b = 18.04 (MAX: 597.99); a = 17.54 (MAX: 680.16)
Apache IoTDB 1.1.2 - Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 500 (point/sec, More Is Better): b = 992909.69; a = 995259.68
Apache IoTDB 1.1.2 - Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 500 (Average Latency, Fewer Is Better): b = 36.16 (MAX: 769.5); a = 35.99 (MAX: 724.8)
Apache IoTDB 1.1.2 - Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 200 (point/sec, More Is Better): b = 920435.77; a = 904320.60
Apache IoTDB 1.1.2 - Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 200 (Average Latency, Fewer Is Better): b = 14.42 (MAX: 596.84); a = 14.84 (MAX: 605.55)
Apache IoTDB 1.1.2 - Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 500 (point/sec, More Is Better): b = 1137612.61; a = 1134736.54
Apache IoTDB 1.1.2 - Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 500 (Average Latency, Fewer Is Better): b = 36.82 (MAX: 691.5); a = 36.74 (MAX: 793.88)
Apache IoTDB 1.1.2 - Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 200 (point/sec, More Is Better): b = 1141859.25; a = 1199743.22
Apache IoTDB 1.1.2 - Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 200 (Average Latency, Fewer Is Better): b = 14.12 (MAX: 878.17); a = 13.25 (MAX: 896.77)
Apache IoTDB 1.1.2 - Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 500 (point/sec, More Is Better): b = 1372429.58; a = 1343156.56
Apache IoTDB 1.1.2 - Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 500 (Average Latency, Fewer Is Better): b = 32.75 (MAX: 992.49); a = 33.38 (MAX: 934.86)
Apache IoTDB 1.1.2 - Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 200 (point/sec, More Is Better): b = 34807016.85; a = 34266143.85
Apache IoTDB 1.1.2 - Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 200 (Average Latency, Fewer Is Better): b = 42.19 (MAX: 784.56); a = 42.79 (MAX: 855.16)
Apache IoTDB 1.1.2 - Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 500 (point/sec, More Is Better): b = 43021501.40; a = 39562245.22
Apache IoTDB 1.1.2 - Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 500 (Average Latency, Fewer Is Better): b = 96.18 (MAX: 1249.92); a = 109.40 (MAX: 2142.92)
libxsmm 2-1.17-3645 - M N K: 128 (GFLOPS/s, More Is Better): a = 1055.3; b = 1946.7; SE +/- 54.55, N = 2. 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
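Most a/b pairs in this export differ by well under a few percent, while the libxsmm result differs much more. The following is a minimal sketch of how the relative difference between the two runs could be computed. It is not part of the Phoronix Test Suite output; the helper name percent_change and the choice of results are illustrative assumptions, and the values are copied from the result lines above on the assumption that they are listed in the same order as the run labels (a, then b).

```python
def percent_change(a: float, b: float) -> float:
    """Return the change of run b relative to run a, in percent."""
    return (b - a) / a * 100.0


# (a, b) pairs copied from this export; all three are "More Is Better".
results = {
    "libxsmm M N K: 128 (GFLOPS/s)": (1055.3, 1946.7),
    "Liquid-DSP 128 threads, filter 57 (samples/s)": (2519200000, 2426350000),
    "OSPRay particle_volume/scivis (items/s)": (24.36, 24.78),
}

for name, (a, b) in results.items():
    # A positive value means run b scored higher than run a.
    print(f"{name}: b vs a = {percent_change(a, b):+.2f}%")
```

With only N = 2 samples per result, a difference of this kind is best read alongside the reported SE rather than on its own.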
Phoronix Test Suite v10.8.5