xeon auggy
Tests for a future article. 2 x Intel Xeon Platinum 8380 testing with an Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS) and ASPEED on Ubuntu 22.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2308065-NE-XEONAUGGY78&rdt&grr .
xeon auggy: system details (runs a and b)
Processor: 2 x Intel Xeon Platinum 8380 @ 3.40GHz (80 Cores / 160 Threads)
Motherboard: Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS)
Chipset: Intel Ice Lake IEH
Memory: 512GB
Disk: 7682GB INTEL SSDPF2KX076TZ
Graphics: ASPEED
Monitor: VE228
Network: 2 x Intel X710 for 10GBASE-T + 2 x Intel E810-C for QSFP
OS: Ubuntu 22.10
Kernel: 6.2.0-rc5-phx-dodt (x86_64)
Desktop: GNOME Shell 43.0
Display Server: X Server 1.21.1.3
Vulkan: 1.3.224
Compiler: GCC 12.2.0
File-System: ext4
Screen Resolution: 1920x1080
Kernel Details: Transparent Huge Pages: madvise
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: intel_pstate performance (EPP: performance) - CPU Microcode: 0xd000389
Java Details: OpenJDK Runtime Environment (build 11.0.19+7-post-Ubuntu-0ubuntu122.10.1)
Python Details: Python 3.10.7
Security Details: dodt: Mitigation of DOITM + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
xeon auggy: result overview (a "-" marks a test with no result for that run)
Test | a | b
build-gcc: Time To Compile | 957.946 | 956.127
couchdb: 500 - 1000 - 30 | 1090.424 | -
libxsmm: 128 | 1055.3 | 1946.7
blender: Barbershop - CPU-Only | 239.55 | 239.03
libxsmm: 256 | 599.8 | 592.5
ospray: particle_volume/pathtracer/real_time | 151.138 | 151.273
ncnn: CPU - FastestDet | 9.62 | 10.01
ncnn: CPU - vision_transformer | 46.50 | 45.56
ncnn: CPU - regnety_400m | 38.20 | 39.37
ncnn: CPU - squeezenet_ssd | 15.78 | 16.13
ncnn: CPU - yolov4-tiny | 24.77 | 24.48
ncnn: CPU - resnet50 | 17.57 | 18.15
ncnn: CPU - alexnet | 5.65 | 5.44
ncnn: CPU - resnet18 | 10.31 | 9.63
ncnn: CPU - vgg16 | 26.27 | 26.19
ncnn: CPU - googlenet | 17.06 | 16.72
ncnn: CPU - blazeface | 4.49 | 4.61
ncnn: CPU - efficientnet-b0 | 11.48 | 11.64
ncnn: CPU - mnasnet | 7.61 | 7.43
ncnn: CPU - shufflenet-v2 | 9.82 | 9.75
ncnn: CPU-v3-v3 - mobilenet-v3 | 8.71 | 8.77
ncnn: CPU-v2-v2 - mobilenet-v2 | 7.91 | 7.94
ncnn: CPU - mobilenet | 16.06 | 15.90
vvenc: Bosphorus 4K - Fast | 5.672 | 5.717
ospray: particle_volume/scivis/real_time | 24.3592 | 24.7827
z3: 2.smt2 | 87.998 | 87.178
couchdb: 300 - 1000 - 30 | 152.456 | -
ospray: particle_volume/ao/real_time | 24.637 | 24.6207
blender: Classroom - CPU-Only | 62.35 | 62.51
vvenc: Bosphorus 4K - Faster | 10.284 | 10.430
srsran: PUSCH Processor Benchmark, Throughput Total | 9800.5 | 9756.7
couchdb: 100 - 1000 - 30 | 94.834 | -
gpaw: Carbon Nanotube | 45.824 | 45.636
vvenc: Bosphorus 1080p - Fast | 15.708 | 15.723
ospray: gravity_spheres_volume/dim_512/scivis/real_time | 20.811 | 20.4818
ospray: gravity_spheres_volume/dim_512/ao/real_time | 21.2056 | 20.9459
heffte: c2c - Stock - double - 512 | 47.2801 | 47.3856
apache-iotdb: 500 - 1 - 500 (Average Latency) | 33.38 | 32.75
apache-iotdb: 500 - 1 - 500 (point/sec) | 1343156.56 | 1372429.58
ospray: gravity_spheres_volume/dim_512/pathtracer/real_time | 22.6977 | 22.5752
heffte: c2c - FFTW - double - 512 | 49.4363 | 48.7210
blender: Fishy Cat - CPU-Only | 30.59 | 30.77
quantlib | 2622.9 | 2607.6
stress-ng: Cloning | 16195.03 | 13172.81
stress-ng: Pthread | 92131.54 | 90361.7
stress-ng: Zlib | 6879.86 | 6880.22
stress-ng: Vector Floating Point | 132479.08 | 131100.25
stress-ng: Fused Multiply-Add | 181083180.47 | 181314757.42
stress-ng: Matrix 3D Math | 12743.81 | 12742.70
stress-ng: Pipe | 40500166.81 | 49325396.97
stress-ng: Wide Vector Math | 2195391.41 | 2196242.21
stress-ng: Floating Point | 21134.81 | 21133.02
stress-ng: AVL Tree | 610.69 | 610.83
stress-ng: Vector Shuffle | 48054.48 | 48076.78
liquid-dsp: 16 - 256 - 512 | 201615000 | 198790000
liquid-dsp: 16 - 256 - 57 | 615105000 | 623535000
liquid-dsp: 16 - 256 - 32 | 493660000 | 498410000
liquid-dsp: 1 - 256 - 512 | 13323000 | 13291000
liquid-dsp: 1 - 256 - 57 | 53918500 | 53926500
liquid-dsp: 1 - 256 - 32 | 32338000 | 32267000
apache-iotdb: 100 - 100 - 500 (Average Latency) | 109.4 | 96.18
apache-iotdb: 100 - 100 - 500 (point/sec) | 39562245.22 | 43021501.4
encode-opus: WAV To Opus Encode | 36.736 | 36.726
z3: 1.smt2 | 25.713 | 25.251
apache-iotdb: 200 - 1 - 500 (Average Latency) | 36.74 | 36.82
apache-iotdb: 200 - 1 - 500 (point/sec) | 1134736.54 | 1137612.61
srsran: Downlink Processor Benchmark | 556.5 | 556.8
apache-iotdb: 500 - 1 - 200 (Average Latency) | 13.25 | 14.12
apache-iotdb: 500 - 1 - 200 (point/sec) | 1199743.22 | 1141859.25
blender: BMW27 - CPU-Only | 23.69 | 23.62
vvenc: Bosphorus 1080p - Faster | 29.077 | 29.176
liquid-dsp: 160 - 256 - 512 | 1013200000 | 1011200000
liquid-dsp: 128 - 256 - 512 | 949400000 | 945190000
liquid-dsp: 64 - 256 - 512 | 725840000 | 730310000
liquid-dsp: 32 - 256 - 512 | 400730000 | 396265000
liquid-dsp: 160 - 256 - 57 | 2602300000 | 2636450000
liquid-dsp: 128 - 256 - 57 | 2519200000 | 2426350000
liquid-dsp: 160 - 256 - 32 | 3390700000 | 3381950000
liquid-dsp: 128 - 256 - 32 | 2961100000 | 2945150000
liquid-dsp: 64 - 256 - 57 | 2069200000 | 2076650000
liquid-dsp: 64 - 256 - 32 | 1805000000 | 1825450000
liquid-dsp: 32 - 256 - 57 | 1197700000 | 1185500000
liquid-dsp: 32 - 256 - 32 | 992540000 | 993445000
heffte: r2c - FFTW - double - 512 | 90.5745 | 90.2388
heffte: c2c - Stock - float - 512 | 93.3349 | 92.2568
heffte: r2c - Stock - double - 512 | 94.2637 | 92.7173
heffte: c2c - FFTW - float - 512 | 94.8348 | 94.3442
apache-iotdb: 100 - 100 - 200 (Average Latency) | 42.79 | 42.19
apache-iotdb: 100 - 100 - 200 (point/sec) | 34266143.85 | 34807016.85
embree: Pathtracer - Asian Dragon Obj | 76.9608 | 77.2633
apache-iotdb: 100 - 1 - 500 (Average Latency) | 35.99 | 36.16
apache-iotdb: 100 - 1 - 500 (point/sec) | 995259.68 | 992909.69
embree: Pathtracer ISPC - Asian Dragon Obj | 89.8447 | 90.0132
srsran: PUSCH Processor Benchmark, Throughput Thread | 164.8 | 164.7
apache-iotdb: 200 - 1 - 200 (Average Latency) | 14.84 | 14.42
apache-iotdb: 200 - 1 - 200 (point/sec) | 904320.6 | 920435.77
apache-iotdb: 100 - 1 - 200 (Average Latency) | 17.54 | 18.04
apache-iotdb: 100 - 1 - 200 (point/sec) | 638644.35 | 628202.55
dav1d: Chimera 1080p 10-bit | 476.82 | 476.77
dav1d: Chimera 1080p | 516.17 | 516.50
heffte: r2c - FFTW - float - 512 | 170.906 | 171.134
heffte: r2c - Stock - float - 512 | 176.630 | 174.114
oidn: RTLightmap.hdr.4096x4096 - CPU-Only | 1.47 | 1.46
dav1d: Summer Nature 4K | 282.53 | 282.65
remhos: Sample Remap Example | 12.195 | 12.365
embree: Pathtracer - Asian Dragon | 85.2423 | 85.1315
libxsmm: 64 | 1219.9 | 1216.0
libxsmm: 32 | 633.2 | 639.3
embree: Pathtracer - Crown | 72.0419 | 70.7905
embree: Pathtracer ISPC - Asian Dragon | 104.4148 | 104.5539
embree: Pathtracer ISPC - Crown | 87.9306 | 87.9319
oidn: RT.hdr_alb_nrm.3840x2160 - CPU-Only | 3.04 | 3.01
oidn: RT.ldr_alb_nrm.3840x2160 - CPU-Only | 3.05 | 3.03
heffte: c2c - FFTW - double - 256 | 45.8509 | 46.4607
dav1d: Summer Nature 1080p | 699.97 | 699.09
heffte: c2c - Stock - double - 256 | 46.6636 | 46.5052
heffte: r2c - FFTW - double - 256 | 93.0098 | 92.8161
heffte: c2c - FFTW - float - 256 | 102.278 | 98.8315
heffte: r2c - Stock - double - 256 | 101.938 | 102.426
heffte: c2c - Stock - float - 256 | 101.8 | 104.345
heffte: r2c - FFTW - float - 256 | 222.215 | 230.541
heffte: r2c - Stock - float - 256 | 236.666 | 236.119
heffte: c2c - Stock - double - 128 | 69.4816 | 67.9483
heffte: c2c - FFTW - double - 128 | 94.4544 | 91.9039
heffte: c2c - Stock - float - 128 | 107.452 | 107.952
heffte: r2c - Stock - double - 128 | 117.006 | 116.839
heffte: r2c - FFTW - double - 128 | 156.224 | 148.177
heffte: c2c - FFTW - float - 128 | 159.344 | 158.711
heffte: r2c - Stock - float - 128 | 185.453 | 182.03
heffte: r2c - FFTW - float - 128 | 199.103 | 199.333
Timed GCC Compilation 13.2, Time To Compile (Seconds, Fewer Is Better): a = 957.95; b = 956.13. SE +/- 1.97, N = 2 (reported for one run only).
Apache CouchDB 3.3.2, Bulk Size: 500 - Inserts: 1000 - Rounds: 30 (Seconds, Fewer Is Better): a = 1090.42; b not reported. (CXX) g++ options: -std=c++17 -lmozjs-78 -lm -lei -fPIC -MMD
libxsmm 2-1.17-3645, M N K: 128 (GFLOPS/s, More Is Better): a = 1055.3; b = 1946.7. SE +/- 54.55, N = 2 (reported for one run only). (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
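Because the results mix "More Is Better" and "Fewer Is Better" metrics, a direction-aware ratio can make runs a and b easier to compare. The following is a minimal sketch, not part of the Phoronix Test Suite: the helper name `b_over_a_speedup` is hypothetical, and the inputs are the libxsmm M N K: 128 and Timed GCC Compilation figures reported above.

```python
def b_over_a_speedup(a: float, b: float, more_is_better: bool) -> float:
    """Return a ratio where values above 1.0 mean run b did better than run a.

    Hypothetical helper for illustration; it only normalises the direction
    of the metric and says nothing about statistical significance.
    """
    if a <= 0 or b <= 0:
        raise ValueError("results must be positive")
    # Throughput-style metrics: larger b is better, so use b / a.
    # Time-style metrics: smaller b is better, so use a / b.
    return b / a if more_is_better else a / b


# libxsmm M N K: 128 (GFLOPS/s, More Is Better)
libxsmm_128 = b_over_a_speedup(1055.3, 1946.7, more_is_better=True)
# Timed GCC Compilation (Seconds, Fewer Is Better)
build_gcc = b_over_a_speedup(957.95, 956.13, more_is_better=False)

print(f"libxsmm 128: b is {libxsmm_128:.3f}x run a")
print(f"build-gcc:   b is {build_gcc:.3f}x run a")
```

With at most N = 2 samples per run, a ratio very close to 1.0 is probably best read as no measurable difference between the runs.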
Blender 3.6, Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better): a = 239.55 (SE +/- 0.32, N = 2); b = 239.03 (SE +/- 1.32, N = 2).
libxsmm 2-1.17-3645, M N K: 256 (GFLOPS/s, More Is Better): a = 599.8; b = 592.5. SE +/- 2.65, N = 2 (reported for one run only). (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
OSPRay 2.12, Benchmark: particle_volume/pathtracer/real_time (Items Per Second, More Is Better): a = 151.14; b = 151.27. SE +/- 0.63, N = 2 (reported for one run only).
NCNN 20230517, Target: CPU - Model: FastestDet (ms, Fewer Is Better): a = 9.62 (SE +/- 0.05, N = 2; MIN: 9.35 / MAX: 10.52); b = 10.01 (SE +/- 0.28, N = 2; MIN: 9.4 / MAX: 59.43). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: vision_transformer (ms, Fewer Is Better): a = 46.50 (SE +/- 2.46, N = 2; MIN: 42.6 / MAX: 72.28); b = 45.56 (SE +/- 1.19, N = 2; MIN: 43.24 / MAX: 70.59). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: regnety_400m (ms, Fewer Is Better): a = 38.20 (SE +/- 0.86, N = 2; MIN: 36.18 / MAX: 62.76); b = 39.37 (SE +/- 1.10, N = 2; MIN: 37.07 / MAX: 103.97). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better): a = 15.78 (SE +/- 0.07, N = 2; MIN: 15.4 / MAX: 43.08); b = 16.13 (SE +/- 0.41, N = 2; MIN: 15.35 / MAX: 39.36). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better): a = 24.77 (SE +/- 0.65, N = 2; MIN: 22.68 / MAX: 208.18); b = 24.48 (SE +/- 0.64, N = 2; MIN: 22.66 / MAX: 47.54). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: resnet50 (ms, Fewer Is Better): a = 17.57 (SE +/- 0.36, N = 2; MIN: 16.92 / MAX: 18.88); b = 18.15 (SE +/- 0.83, N = 2; MIN: 16.98 / MAX: 42.81). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: alexnet (ms, Fewer Is Better): a = 5.65 (SE +/- 0.43, N = 2; MIN: 5.03 / MAX: 6.71); b = 5.44 (SE +/- 0.22, N = 2; MIN: 5.08 / MAX: 7.52). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: resnet18 (ms, Fewer Is Better): a = 10.31 (SE +/- 1.04, N = 2; MIN: 9.03 / MAX: 33.3); b = 9.63 (SE +/- 0.31, N = 2; MIN: 9.16 / MAX: 26.03). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: vgg16 (ms, Fewer Is Better): a = 26.27 (SE +/- 0.84, N = 2; MIN: 24.05 / MAX: 301.35); b = 26.19 (SE +/- 0.34, N = 2; MIN: 24.19 / MAX: 341.6). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: googlenet (ms, Fewer Is Better): a = 17.06 (SE +/- 1.05, N = 2; MIN: 15.5 / MAX: 66.12); b = 16.72 (SE +/- 0.37, N = 2; MIN: 15.67 / MAX: 100.92). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: blazeface (ms, Fewer Is Better): a = 4.49 (SE +/- 0.09, N = 2; MIN: 4.31 / MAX: 5.13); b = 4.61 (SE +/- 0.01, N = 2; MIN: 4.49 / MAX: 5.33). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better): a = 11.48 (SE +/- 0.23, N = 2; MIN: 10.9 / MAX: 56.34); b = 11.64 (SE +/- 0.26, N = 2; MIN: 10.85 / MAX: 37.54). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: mnasnet (ms, Fewer Is Better): a = 7.61 (SE +/- 0.07, N = 2; MIN: 7.33 / MAX: 43.29); b = 7.43 (SE +/- 0.01, N = 2; MIN: 7.16 / MAX: 15.37). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better): a = 9.82 (SE +/- 0.03, N = 2; MIN: 9.6 / MAX: 12.61); b = 9.75 (SE +/- 0.06, N = 2; MIN: 9.56 / MAX: 13.68). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better): a = 8.71 (SE +/- 0.14, N = 2; MIN: 8.43 / MAX: 9.8); b = 8.77 (SE +/- 0.03, N = 2; MIN: 8.59 / MAX: 32.78). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better): a = 7.91 (SE +/- 0.13, N = 2; MIN: 7.68 / MAX: 9.6); b = 7.94 (SE +/- 0.02, N = 2; MIN: 7.81 / MAX: 10.99). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517, Target: CPU - Model: mobilenet (ms, Fewer Is Better): a = 16.06 (SE +/- 0.82, N = 2; MIN: 14.92 / MAX: 25.43); b = 15.90 (SE +/- 0.28, N = 2; MIN: 15.2 / MAX: 39.56). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
VVenC 1.9, Video Input: Bosphorus 4K - Video Preset: Fast (Frames Per Second, More Is Better): a = 5.672 (SE +/- 0.019, N = 2); b = 5.717 (SE +/- 0.063, N = 2). (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
OSPRay 2.12, Benchmark: particle_volume/scivis/real_time (Items Per Second, More Is Better): a = 24.36; b = 24.78. SE +/- 0.03, N = 2 (reported for one run only).
Z3 Theorem Prover 4.12.1, SMT File: 2.smt2 (Seconds, Fewer Is Better): a = 88.00 (SE +/- 0.02, N = 2); b = 87.18 (SE +/- 0.05, N = 2). (CXX) g++ options: -lpthread -std=c++17 -fvisibility=hidden -mfpmath=sse -msse -msse2 -O3 -fPIC
Apache CouchDB 3.3.2, Bulk Size: 300 - Inserts: 1000 - Rounds: 30 (Seconds, Fewer Is Better): a = 152.46; b not reported. (CXX) g++ options: -std=c++17 -lmozjs-78 -lm -lei -fPIC -MMD
OSPRay 2.12, Benchmark: particle_volume/ao/real_time (Items Per Second, More Is Better): a = 24.64; b = 24.62. SE +/- 0.09, N = 2 (reported for one run only).
Blender 3.6, Blend File: Classroom - Compute: CPU-Only (Seconds, Fewer Is Better): a = 62.35 (SE +/- 0.04, N = 2); b = 62.51 (SE +/- 0.19, N = 2).
VVenC 1.9, Video Input: Bosphorus 4K - Video Preset: Faster (Frames Per Second, More Is Better): a = 10.28 (SE +/- 0.03, N = 2); b = 10.43 (SE +/- 0.10, N = 2). (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
srsRAN Project 23.5, Test: PUSCH Processor Benchmark, Throughput Total (Mbps, More Is Better): a = 9800.5 (SE +/- 47.35, N = 2); b = 9756.7 (SE +/- 33.95, N = 2). (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest
Apache CouchDB 3.3.2, Bulk Size: 100 - Inserts: 1000 - Rounds: 30 (Seconds, Fewer Is Better): a = 94.83; b not reported. (CXX) g++ options: -std=c++17 -lmozjs-78 -lm -lei -fPIC -MMD
GPAW 23.6, Input: Carbon Nanotube (Seconds, Fewer Is Better): a = 45.82 (SE +/- 0.02, N = 2); b = 45.64 (SE +/- 0.03, N = 2). (CC) gcc options: -shared -fwrapv -O2 -lxc -lblas -lmpi
VVenC 1.9, Video Input: Bosphorus 1080p - Video Preset: Fast (Frames Per Second, More Is Better): a = 15.71 (SE +/- 0.06, N = 2); b = 15.72 (SE +/- 0.04, N = 2). (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
OSPRay 2.12, Benchmark: gravity_spheres_volume/dim_512/scivis/real_time (Items Per Second, More Is Better): a = 20.81; b = 20.48. SE +/- 0.05, N = 2 (reported for one run only).
OSPRay 2.12, Benchmark: gravity_spheres_volume/dim_512/ao/real_time (Items Per Second, More Is Better): a = 21.21; b = 20.95. SE +/- 0.22, N = 2 (reported for one run only).
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: Stock - Precision: double - X Y Z: 512 (GFLOP/s, More Is Better): a = 47.28 (SE +/- 0.14, N = 2); b = 47.39 (SE +/- 0.07, N = 2). (CXX) g++ options: -O3
Apache IoTDB 1.1.2, Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 500 (Average Latency, Fewer Is Better): a = 33.38 (MAX: 934.86); b = 32.75 (MAX: 992.49).
Apache IoTDB 1.1.2, Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 500 (point/sec, More Is Better): a = 1343156.56; b = 1372429.58.
OSPRay 2.12, Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time (Items Per Second, More Is Better): a = 22.70; b = 22.58. SE +/- 0.00, N = 2 (reported for one run only).
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512 (GFLOP/s, More Is Better): a = 49.44 (SE +/- 0.29, N = 2); b = 48.72 (SE +/- 0.39, N = 2). (CXX) g++ options: -O3
Blender 3.6, Blend File: Fishy Cat - Compute: CPU-Only (Seconds, Fewer Is Better): a = 30.59 (SE +/- 0.01, N = 2); b = 30.77 (SE +/- 0.02, N = 2).
QuantLib 1.30 (MFLOPS, More Is Better): a = 2622.9 (SE +/- 2.00, N = 2); b = 2607.6 (SE +/- 1.80, N = 2). (CXX) g++ options: -O3 -march=native -fPIE -pie
Stress-NG 0.15.10, Test: Cloning (Bogo Ops/s, More Is Better): a = 16195.03 (SE +/- 3270.70, N = 2); b = 13172.81 (SE +/- 654.66, N = 2). (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.15.10, Test: Pthread (Bogo Ops/s, More Is Better): a = 92131.54 (SE +/- 279.15, N = 2); b = 90361.70 (SE +/- 894.90, N = 2). (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.15.10, Test: Zlib (Bogo Ops/s, More Is Better): a = 6879.86 (SE +/- 8.83, N = 2); b = 6880.22 (SE +/- 4.39, N = 2). (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.15.10, Test: Vector Floating Point (Bogo Ops/s, More Is Better): a = 132479.08 (SE +/- 872.22, N = 2); b = 131100.25 (SE +/- 235.00, N = 2). (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.15.10, Test: Fused Multiply-Add (Bogo Ops/s, More Is Better): a = 181083180.47 (SE +/- 118686.48, N = 2); b = 181314757.42 (SE +/- 92010.25, N = 2). (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.15.10, Test: Matrix 3D Math (Bogo Ops/s, More Is Better): a = 12743.81 (SE +/- 9.80, N = 2); b = 12742.70 (SE +/- 5.06, N = 2). (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.15.10, Test: Pipe (Bogo Ops/s, More Is Better): a = 40500166.81 (SE +/- 2523742.74, N = 2); b = 49325396.97 (SE +/- 6369572.71, N = 2). (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.15.10, Test: Wide Vector Math (Bogo Ops/s, More Is Better): a = 2195391.41 (SE +/- 1200.16, N = 2); b = 2196242.21 (SE +/- 497.69, N = 2). (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.15.10, Test: Floating Point (Bogo Ops/s, More Is Better): a = 21134.81 (SE +/- 6.86, N = 2); b = 21133.02 (SE +/- 9.14, N = 2). (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.15.10, Test: AVL Tree (Bogo Ops/s, More Is Better): a = 610.69 (SE +/- 0.08, N = 2); b = 610.83 (SE +/- 0.44, N = 2). (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.15.10, Test: Vector Shuffle (Bogo Ops/s, More Is Better): a = 48054.48 (SE +/- 1.26, N = 2); b = 48076.78 (SE +/- 42.22, N = 2). (CXX) g++ options: -O2 -std=gnu99 -lc
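The Stress-NG Cloning and Pipe results above show the largest gaps between runs a and b, but also the largest standard errors. As a rough sketch, the gap can be expressed in units of the combined standard error. This assumes the two SE values in each result are listed in a, b order and that the runs are independent; the function name `gap_in_combined_se` is hypothetical. With N = 2 it is only a coarse sanity check, not a significance test.

```python
import math


def gap_in_combined_se(a: float, se_a: float, b: float, se_b: float) -> float:
    """Absolute a-b gap divided by the combined standard error.

    Assumes independent runs, so the standard errors add in quadrature.
    Hypothetical helper for illustration only.
    """
    combined = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(a - b) / combined


# Values copied from the Stress-NG Cloning and Pipe results (Bogo Ops/s).
cloning = gap_in_combined_se(16195.03, 3270.70, 13172.81, 654.66)
pipe = gap_in_combined_se(40500166.81, 2523742.74, 49325396.97, 6369572.71)

print(f"Cloning: gap is {cloning:.2f} combined SEs")
print(f"Pipe:    gap is {pipe:.2f} combined SEs")
```

A gap of roughly one combined standard error or less suggests the difference may be run-to-run noise rather than a real change between a and b.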
Liquid-DSP 1.6, Threads: 16 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): a = 201615000 (SE +/- 795000.00, N = 2); b = 198790000 (SE +/- 650000.00, N = 2). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6, Threads: 16 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): a = 615105000 (SE +/- 3155000.00, N = 2); b = 623535000 (SE +/- 11305000.00, N = 2). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6, Threads: 16 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): a = 493660000 (SE +/- 1500000.00, N = 2); b = 498410000 (SE +/- 830000.00, N = 2). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6, Threads: 1 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): a = 13323000 (SE +/- 1000.00, N = 2); b = 13291000 (SE +/- 34000.00, N = 2). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6, Threads: 1 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): a = 53918500 (SE +/- 500.00, N = 2); b = 53926500 (SE +/- 1500.00, N = 2). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6, Threads: 1 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): a = 32338000 (SE +/- 0.00, N = 2); b = 32267000 (SE +/- 0.00, N = 2). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Apache IoTDB 1.1.2, Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 500 (Average Latency, Fewer Is Better): a = 109.40 (MAX: 2142.92); b = 96.18 (MAX: 1249.92).
Apache IoTDB 1.1.2, Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 500 (point/sec, More Is Better): a = 39562245.22; b = 43021501.40.
Opus Codec Encoding 1.4, WAV To Opus Encode (Seconds, Fewer Is Better): a = 36.74; b = 36.73. SE +/- 0.01, N = 2 (reported for one run only). (CXX) g++ options: -O3 -fvisibility=hidden -logg -lm
Z3 Theorem Prover 4.12.1, SMT File: 1.smt2 (Seconds, Fewer Is Better): a = 25.71 (SE +/- 0.03, N = 2); b = 25.25 (SE +/- 0.07, N = 2). (CXX) g++ options: -lpthread -std=c++17 -fvisibility=hidden -mfpmath=sse -msse -msse2 -O3 -fPIC
Apache IoTDB 1.1.2, Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 500 (Average Latency, Fewer Is Better): a = 36.74 (MAX: 793.88); b = 36.82 (MAX: 691.5).
Apache IoTDB 1.1.2, Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 500 (point/sec, More Is Better): a = 1134736.54; b = 1137612.61.
srsRAN Project 23.5, Test: Downlink Processor Benchmark (Mbps, More Is Better): a = 556.5 (SE +/- 0.70, N = 2); b = 556.8 (SE +/- 1.25, N = 2). (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest
Apache IoTDB 1.1.2, Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 200 (Average Latency, Fewer Is Better): a = 13.25 (MAX: 896.77); b = 14.12 (MAX: 878.17).
Apache IoTDB 1.1.2, Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 200 (point/sec, More Is Better): a = 1199743.22; b = 1141859.25.
Blender 3.6, Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better): a = 23.69 (SE +/- 0.09, N = 2); b = 23.62 (SE +/- 0.06, N = 2).
VVenC 1.9, Video Input: Bosphorus 1080p - Video Preset: Faster (Frames Per Second, More Is Better): a = 29.08 (SE +/- 0.37, N = 2); b = 29.18 (SE +/- 0.23, N = 2). (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
Liquid-DSP 1.6, Threads: 160 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): a = 1013200000; b = 1011200000. SE +/- 1800000.00, N = 2 (reported for one run only). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6, Threads: 128 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): a = 949400000; b = 945190000. SE +/- 2180000.00, N = 2 (reported for one run only). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6, Threads: 64 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): a = 725840000; b = 730310000. SE +/- 1250000.00, N = 2 (reported for one run only). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6, Threads: 32 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): a = 400730000; b = 396265000. SE +/- 2775000.00, N = 2 (reported for one run only). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6, Threads: 160 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): a = 2602300000; b = 2636450000. SE +/- 17250000.00, N = 2 (reported for one run only). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6, Threads: 128 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): a = 2519200000; b = 2426350000. SE +/- 13350000.00, N = 2 (reported for one run only). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6, Threads: 160 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): a = 3390700000; b = 3381950000. SE +/- 9150000.00, N = 2 (reported for one run only). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6, Threads: 128 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): a = 2961100000; b = 2945150000. SE +/- 6350000.00, N = 2 (reported for one run only). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6, Threads: 64 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): a = 2069200000; b = 2076650000. SE +/- 13650000.00, N = 2 (reported for one run only). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6, Threads: 64 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): a = 1805000000; b = 1825450000. SE +/- 2750000.00, N = 2 (reported for one run only). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6, Threads: 32 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): a = 1197700000; b = 1185500000. SE +/- 20600000.00, N = 2 (reported for one run only). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6, Threads: 32 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): a = 992540000; b = 993445000. SE +/- 2085000.00, N = 2 (reported for one run only). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512 (GFLOP/s, More Is Better): a = 90.57 (SE +/- 1.08, N = 2), b = 90.24 (SE +/- 1.17, N = 2). (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: float - X Y Z: 512 (GFLOP/s, More Is Better): a = 93.33 (SE +/- 0.40, N = 2), b = 92.26 (SE +/- 0.24, N = 2). (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: double - X Y Z: 512 (GFLOP/s, More Is Better): a = 94.26 (SE +/- 0.13, N = 2), b = 92.72 (SE +/- 0.73, N = 2). (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512 (GFLOP/s, More Is Better): a = 94.83 (SE +/- 1.11, N = 2), b = 94.34 (SE +/- 0.95, N = 2). (CXX) g++ options: -O3
Apache IoTDB 1.1.2 Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 200 (Average Latency, Fewer Is Better): a = 42.79 (MAX: 855.16), b = 42.19 (MAX: 784.56)
Apache IoTDB 1.1.2 Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 200 (point/sec, More Is Better): a = 34266143.85, b = 34807016.85
Embree 4.1 Binary: Pathtracer - Model: Asian Dragon Obj (Frames Per Second, More Is Better): a = 76.96 (SE +/- 0.03, N = 2; MIN: 75.53 / MAX: 82.14), b = 77.26 (SE +/- 0.03, N = 2; MIN: 75.78 / MAX: 81.08)
Apache IoTDB 1.1.2 Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 500 (Average Latency, Fewer Is Better): a = 35.99 (MAX: 724.8), b = 36.16 (MAX: 769.5)
Apache IoTDB 1.1.2 Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 500 (point/sec, More Is Better): a = 995259.68, b = 992909.69
Embree 4.1 Binary: Pathtracer ISPC - Model: Asian Dragon Obj (Frames Per Second, More Is Better): a = 89.84 (SE +/- 0.06, N = 2; MIN: 87.68 / MAX: 94.71), b = 90.01 (SE +/- 0.05, N = 2; MIN: 87.6 / MAX: 94.43)
srsRAN Project 23.5 Test: PUSCH Processor Benchmark, Throughput Thread (Mbps, More Is Better): a = 164.8 (SE +/- 1.70, N = 2), b = 164.7 (SE +/- 0.90, N = 2). (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest
Apache IoTDB 1.1.2 Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 200 (Average Latency, Fewer Is Better): a = 14.84 (MAX: 605.55), b = 14.42 (MAX: 596.84)
Apache IoTDB 1.1.2 Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 200 (point/sec, More Is Better): a = 904320.60, b = 920435.77
Apache IoTDB 1.1.2 Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 200 (Average Latency, Fewer Is Better): a = 17.54 (MAX: 680.16), b = 18.04 (MAX: 597.99)
Apache IoTDB 1.1.2 Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 200 (point/sec, More Is Better): a = 638644.35, b = 628202.55
dav1d 1.2.1 Video Input: Chimera 1080p 10-bit (FPS, More Is Better): a = 476.82, b = 476.77; SE +/- 0.41, N = 2. (CC) gcc options: -pthread -lm
dav1d 1.2.1 Video Input: Chimera 1080p (FPS, More Is Better): a = 516.17, b = 516.50; SE +/- 0.06, N = 2. (CC) gcc options: -pthread -lm
HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512 (GFLOP/s, More Is Better): a = 170.91 (SE +/- 1.25, N = 2), b = 171.13 (SE +/- 1.39, N = 2). (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: float - X Y Z: 512 (GFLOP/s, More Is Better): a = 176.63 (SE +/- 0.68, N = 2), b = 174.11 (SE +/- 0.12, N = 2). (CXX) g++ options: -O3
Intel Open Image Denoise 2.0 Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only (Images / Sec, More Is Better): a = 1.47, b = 1.46; SE +/- 0.00, N = 2
dav1d 1.2.1 Video Input: Summer Nature 4K (FPS, More Is Better): a = 282.53, b = 282.65; SE +/- 0.08, N = 2. (CC) gcc options: -pthread -lm
Remhos 1.0 Test: Sample Remap Example (Seconds, Fewer Is Better): a = 12.20, b = 12.37; SE +/- 0.10, N = 2. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi
Embree 4.1 Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, More Is Better): a = 85.24 (SE +/- 0.04, N = 2; MIN: 83.75 / MAX: 89.99), b = 85.13 (SE +/- 0.14, N = 2; MIN: 83.65 / MAX: 90.45)
libxsmm 2-1.17-3645 M N K: 64 (GFLOPS/s, More Is Better): a = 1219.9, b = 1216.0; SE +/- 1.25, N = 2. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
libxsmm 2-1.17-3645 M N K: 32 (GFLOPS/s, More Is Better): a = 633.2, b = 639.3; SE +/- 2.35, N = 2. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
Embree 4.1 Binary: Pathtracer - Model: Crown (Frames Per Second, More Is Better): a = 72.04 (MIN: 68.2 / MAX: 79.55), b = 70.79 (MIN: 67 / MAX: 79.71); SE +/- 0.12, N = 2
Embree 4.1 Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better): a = 104.41 (SE +/- 0.32, N = 2; MIN: 101.88 / MAX: 109.22), b = 104.55 (SE +/- 0.24, N = 2; MIN: 102.2 / MAX: 108.91)
Embree 4.1 Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better): a = 87.93 (MIN: 85.27 / MAX: 92.58), b = 87.93 (MIN: 84.73 / MAX: 92.37); SE +/- 0.10, N = 2
Intel Open Image Denoise 2.0 Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only (Images / Sec, More Is Better): a = 3.04, b = 3.01; SE +/- 0.00, N = 2
Intel Open Image Denoise 2.0 Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only (Images / Sec, More Is Better): a = 3.05, b = 3.03; SE +/- 0.01, N = 2
HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256 (GFLOP/s, More Is Better): a = 45.85, b = 46.46; SE +/- 0.55, N = 2. (CXX) g++ options: -O3
dav1d 1.2.1 Video Input: Summer Nature 1080p (FPS, More Is Better): a = 699.97, b = 699.09; SE +/- 0.50, N = 2. (CC) gcc options: -pthread -lm
HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: double - X Y Z: 256 (GFLOP/s, More Is Better): a = 46.66, b = 46.51; SE +/- 0.08, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256 (GFLOP/s, More Is Better): a = 93.01, b = 92.82; SE +/- 1.52, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256 (GFLOP/s, More Is Better): a = 102.28, b = 98.83; SE +/- 0.00, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: double - X Y Z: 256 (GFLOP/s, More Is Better): a = 101.94, b = 102.43; SE +/- 0.72, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: float - X Y Z: 256 (GFLOP/s, More Is Better): a = 101.80, b = 104.35; SE +/- 0.34, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256 (GFLOP/s, More Is Better): a = 222.22, b = 230.54; SE +/- 3.31, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: float - X Y Z: 256 (GFLOP/s, More Is Better): a = 236.67, b = 236.12; SE +/- 2.75, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: double - X Y Z: 128 (GFLOP/s, More Is Better): a = 69.48, b = 67.95; SE +/- 1.01, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128 (GFLOP/s, More Is Better): a = 94.45, b = 91.90; SE +/- 1.46, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: float - X Y Z: 128 (GFLOP/s, More Is Better): a = 107.45, b = 107.95; SE +/- 1.54, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: double - X Y Z: 128 (GFLOP/s, More Is Better): a = 117.01, b = 116.84; SE +/- 4.04, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128 (GFLOP/s, More Is Better): a = 156.22, b = 148.18; SE +/- 2.45, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128 (GFLOP/s, More Is Better): a = 159.34, b = 158.71; SE +/- 0.21, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: float - X Y Z: 128 (GFLOP/s, More Is Better): a = 185.45, b = 182.03; SE +/- 0.86, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128 (GFLOP/s, More Is Better): a = 199.10, b = 199.33; SE +/- 0.39, N = 2. (CXX) g++ options: -O3
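For readers who want to compare the two runs numerically, here is a minimal Python sketch of one way to compute the relative change from run a to run b. It is not part of the Phoronix Test Suite output. The function names `relative_change_pct` and `exceeds_noise` are hypothetical, and the two-standard-error threshold is an assumed rule of thumb; with N = 2 the reported SE is itself a rough estimate, so treat the check as indicative only.

```python
def relative_change_pct(a: float, b: float) -> float:
    """Percent change of run b relative to run a (positive means b is larger)."""
    return (b - a) / a * 100.0


def exceeds_noise(a: float, b: float, se: float, k: float = 2.0) -> bool:
    """Assumed heuristic: flag a gap larger than k reported standard errors."""
    return abs(b - a) > k * se


# Values copied from the HeFFTe r2c / FFTW / double / 128 result above.
a, b, se = 156.22, 148.18, 2.45
print(f"b vs a: {relative_change_pct(a, b):+.2f}%")
print("gap exceeds 2 x SE:", exceeds_noise(a, b, se))
```

For "Fewer Is Better" results (latency, seconds) the sign reads the other way: a negative change means run b was faster.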
Phoronix Test Suite v10.8.5