xeon auggy
Tests for a future article. 2 x Intel Xeon Platinum 8380 testing with an Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS) and ASPEED on Ubuntu 22.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2308065-NE-XEONAUGGY78&grs&sor .
xeon auggy: system configuration (runs a and b)
Processor: 2 x Intel Xeon Platinum 8380 @ 3.40GHz (80 Cores / 160 Threads)
Motherboard: Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS)
Chipset: Intel Ice Lake IEH
Memory: 512GB
Disk: 7682GB INTEL SSDPF2KX076TZ
Graphics: ASPEED
Monitor: VE228
Network: 2 x Intel X710 for 10GBASE-T + 2 x Intel E810-C for QSFP
OS: Ubuntu 22.10
Kernel: 6.2.0-rc5-phx-dodt (x86_64)
Desktop: GNOME Shell 43.0
Display Server: X Server 1.21.1.3
Vulkan: 1.3.224
Compiler: GCC 12.2.0
File-System: ext4
Screen Resolution: 1920x1080
Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details - Scaling Governor: intel_pstate performance (EPP: performance) - CPU Microcode: 0xd000389
Java Details - OpenJDK Runtime Environment (build 11.0.19+7-post-Ubuntu-0ubuntu122.10.1)
Python Details - Python 3.10.7
Security Details - dodt: Mitigation of DOITM + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
xeon auggy libxsmm: 128 stress-ng: Cloning stress-ng: Pipe apache-iotdb: 100 - 100 - 500 apache-iotdb: 100 - 100 - 500 ncnn: CPU - resnet18 apache-iotdb: 500 - 1 - 200 heffte: r2c - FFTW - double - 128 apache-iotdb: 500 - 1 - 200 ncnn: CPU - FastestDet ncnn: CPU - alexnet liquid-dsp: 128 - 256 - 57 heffte: r2c - FFTW - float - 256 heffte: c2c - FFTW - float - 256 ncnn: CPU - resnet50 ncnn: CPU - regnety_400m apache-iotdb: 200 - 1 - 200 apache-iotdb: 100 - 1 - 200 heffte: c2c - FFTW - double - 128 ncnn: CPU - blazeface heffte: c2c - Stock - float - 256 ncnn: CPU - mnasnet heffte: c2c - Stock - double - 128 ncnn: CPU - squeezenet_ssd apache-iotdb: 500 - 1 - 500 ncnn: CPU - vision_transformer ncnn: CPU - googlenet stress-ng: Pthread apache-iotdb: 500 - 1 - 500 heffte: r2c - Stock - float - 128 z3: 1.smt2 apache-iotdb: 200 - 1 - 200 embree: Pathtracer - Crown ospray: particle_volume/scivis/real_time heffte: r2c - Stock - double - 512 apache-iotdb: 100 - 1 - 200 ospray: gravity_spheres_volume/dim_512/scivis/real_time apache-iotdb: 100 - 100 - 200 heffte: c2c - FFTW - double - 512 heffte: r2c - Stock - float - 512 apache-iotdb: 100 - 100 - 200 liquid-dsp: 16 - 256 - 512 vvenc: Bosphorus 4K - Faster remhos: Sample Remap Example ncnn: CPU - efficientnet-b0 liquid-dsp: 16 - 256 - 57 heffte: c2c - FFTW - double - 256 liquid-dsp: 160 - 256 - 57 ospray: gravity_spheres_volume/dim_512/ao/real_time libxsmm: 256 ncnn: CPU - yolov4-tiny heffte: c2c - Stock - float - 512 liquid-dsp: 64 - 256 - 32 liquid-dsp: 32 - 256 - 512 stress-ng: Vector Floating Point liquid-dsp: 32 - 256 - 57 ncnn: CPU - mobilenet oidn: RT.hdr_alb_nrm.3840x2160 - CPU-Only libxsmm: 32 liquid-dsp: 16 - 256 - 32 z3: 2.smt2 vvenc: Bosphorus 4K - Fast ncnn: CPU - shufflenet-v2 ncnn: CPU-v3-v3 - mobilenet-v3 oidn: RTLightmap.hdr.4096x4096 - CPU-Only oidn: RT.ldr_alb_nrm.3840x2160 - CPU-Only liquid-dsp: 64 - 256 - 512 blender: Fishy Cat - CPU-Only quantlib: ospray: gravity_spheres_volume/dim_512/pathtracer/real_time 
liquid-dsp: 128 - 256 - 32 heffte: c2c - FFTW - float - 512 heffte: r2c - Stock - double - 256 apache-iotdb: 100 - 1 - 500 heffte: c2c - Stock - float - 128 srsran: PUSCH Processor Benchmark, Throughput Total liquid-dsp: 128 - 256 - 512 gpaw: Carbon Nanotube heffte: c2c - FFTW - float - 128 embree: Pathtracer - Asian Dragon Obj ncnn: CPU-v2-v2 - mobilenet-v2 heffte: r2c - FFTW - double - 512 liquid-dsp: 64 - 256 - 57 heffte: c2c - Stock - double - 256 vvenc: Bosphorus 1080p - Faster libxsmm: 64 ncnn: CPU - vgg16 blender: BMW27 - CPU-Only liquid-dsp: 160 - 256 - 32 blender: Classroom - CPU-Only apache-iotdb: 200 - 1 - 500 liquid-dsp: 1 - 256 - 512 apache-iotdb: 100 - 1 - 500 heffte: r2c - Stock - float - 256 heffte: c2c - Stock - double - 512 liquid-dsp: 1 - 256 - 32 apache-iotdb: 200 - 1 - 500 blender: Barbershop - CPU-Only heffte: r2c - FFTW - double - 256 liquid-dsp: 160 - 256 - 512 build-gcc: Time To Compile embree: Pathtracer ISPC - Asian Dragon Obj heffte: r2c - Stock - double - 128 heffte: r2c - FFTW - float - 512 embree: Pathtracer ISPC - Asian Dragon embree: Pathtracer - Asian Dragon stress-ng: Fused Multiply-Add dav1d: Summer Nature 1080p heffte: r2c - FFTW - float - 128 vvenc: Bosphorus 1080p - Fast liquid-dsp: 32 - 256 - 32 ospray: particle_volume/pathtracer/real_time ospray: particle_volume/ao/real_time dav1d: Chimera 1080p srsran: PUSCH Processor Benchmark, Throughput Thread srsran: Downlink Processor Benchmark stress-ng: Vector Shuffle dav1d: Summer Nature 4K stress-ng: Wide Vector Math encode-opus: WAV To Opus Encode stress-ng: AVL Tree liquid-dsp: 1 - 256 - 57 dav1d: Chimera 1080p 10-bit stress-ng: Matrix 3D Math stress-ng: Floating Point stress-ng: Zlib embree: Pathtracer ISPC - Crown couchdb: 500 - 1000 - 30 couchdb: 300 - 1000 - 30 couchdb: 100 - 1000 - 30 apache-iotdb: 100 - 1 - 200 a b 1055.3 16195.03 40500166.81 109.4 39562245.22 10.31 13.25 156.224 1199743.22 9.62 5.65 2519200000 222.215 102.278 17.57 38.20 14.84 17.54 94.4544 4.49 101.8 7.61 
69.4816 15.78 1343156.56 46.50 17.06 92131.54 33.38 185.453 25.713 904320.6 72.0419 24.3592 94.2637 638644.35 20.811 34266143.85 49.4363 176.630 42.79 201615000 10.284 12.195 11.48 615105000 45.8509 2602300000 21.2056 599.8 24.77 93.3349 1805000000 400730000 132479.08 1197700000 16.06 3.04 633.2 493660000 87.998 5.672 9.82 8.71 1.47 3.05 725840000 30.59 2622.9 22.6977 2961100000 94.8348 101.938 35.99 107.452 9800.5 949400000 45.824 159.344 76.9608 7.91 90.5745 2069200000 46.6636 29.077 1219.9 26.27 23.69 3390700000 62.35 1134736.54 13323000 995259.68 236.666 47.2801 32338000 36.74 239.55 93.0098 1013200000 957.946 89.8447 117.006 170.906 104.4148 85.2423 181083180.47 699.97 199.103 15.708 992540000 151.138 24.637 516.17 164.8 556.5 48054.48 282.53 2195391.41 36.736 610.69 53918500 476.82 12743.81 21134.81 6879.86 87.9306 1090.424 152.456 94.834 1946.7 13172.81 49325396.97 96.18 43021501.4 9.63 14.12 148.177 1141859.25 10.01 5.44 2426350000 230.541 98.8315 18.15 39.37 14.42 18.04 91.9039 4.61 104.345 7.43 67.9483 16.13 1372429.58 45.56 16.72 90361.7 32.75 182.03 25.251 920435.77 70.7905 24.7827 92.7173 628202.55 20.4818 34807016.85 48.7210 174.114 42.19 198790000 10.430 12.365 11.64 623535000 46.4607 2636450000 20.9459 592.5 24.48 92.2568 1825450000 396265000 131100.25 1185500000 15.90 3.01 639.3 498410000 87.178 5.717 9.75 8.77 1.46 3.03 730310000 30.77 2607.6 22.5752 2945150000 94.3442 102.426 36.16 107.952 9756.7 945190000 45.636 158.711 77.2633 7.94 90.2388 2076650000 46.5052 29.176 1216.0 26.19 23.62 3381950000 62.51 1137612.61 13291000 992909.69 236.119 47.3856 32267000 36.82 239.03 92.8161 1011200000 956.127 90.0132 116.839 171.134 104.5539 85.1315 181314757.42 699.09 199.333 15.723 993445000 151.273 24.6207 516.50 164.7 556.8 48076.78 282.65 2196242.21 36.726 610.83 53926500 476.77 12742.70 21133.02 6880.22 87.9319 OpenBenchmarking.org
libxsmm 2-1.17-3645 - M N K: 128 (GFLOPS/s, More Is Better): b: 1946.7, a: 1055.3. SE +/- 54.55, N = 2. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
Stress-NG 0.15.10 - Test: Cloning (Bogo Ops/s, More Is Better): a: 16195.03, b: 13172.81. SE +/- 3270.70, N = 2; SE +/- 654.66, N = 2. (CXX) g++ options: -O2 -std=gnu99 -lc
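The Cloning result above shows a large gap between the two runs alongside large standard errors, so it may help to check whether the gap exceeds the reported noise. The following is a minimal sketch, not part of the Phoronix Test Suite: the helper name `compare_runs` and the 2x threshold are assumptions chosen for illustration, and the runs are assumed to be independent. The combined standard error is symmetric in the two SE values, so it does not matter which SE belongs to which run.

```python
import math


def compare_runs(value_1, se_1, value_2, se_2, threshold=2.0):
    """Compare two benchmark means using their reported standard errors.

    Hypothetical helper for illustration. The 2x threshold is a rule of
    thumb (an assumption), and with N = 2 samples per run it is only a
    rough screen, not a formal significance test.
    """
    diff = value_1 - value_2
    # Relative difference of the first result with respect to the second.
    relative_difference = diff / value_2
    # Standard error of the difference, assuming independent runs.
    combined_se = math.sqrt(se_1 ** 2 + se_2 ** 2)
    exceeds_noise = abs(diff) > threshold * combined_se
    return relative_difference, combined_se, exceeds_noise


# Stress-NG Cloning values from the result above (Bogo Ops/s).
rel, se, significant = compare_runs(16195.03, 3270.70, 13172.81, 654.66)
print(f"relative difference: {rel:.1%}")
print(f"combined SE: {se:.2f}")
print(f"exceeds 2x combined SE: {significant}")
```

Under these assumptions the gap between the two Cloning results is smaller than the combined standard error, so this particular difference is best read as run-to-run variance rather than a real change.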
Stress-NG 0.15.10 - Test: Pipe (Bogo Ops/s, More Is Better): b: 49325396.97, a: 40500166.81. SE +/- 6369572.71, N = 2; SE +/- 2523742.74, N = 2. (CXX) g++ options: -O2 -std=gnu99 -lc
Apache IoTDB 1.1.2 - Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 500 (Average Latency, Fewer Is Better): b: 96.18, a: 109.40. MAX: 1249.92; MAX: 2142.92.
Apache IoTDB 1.1.2 - Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 500 (point/sec, More Is Better): b: 43021501.40, a: 39562245.22.
NCNN 20230517 - Target: CPU - Model: resnet18 (ms, Fewer Is Better): b: 9.63, a: 10.31. SE +/- 0.31, N = 2; SE +/- 1.04, N = 2. MIN: 9.16 / MAX: 26.03; MIN: 9.03 / MAX: 33.3. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Apache IoTDB 1.1.2 - Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 200 (Average Latency, Fewer Is Better): a: 13.25, b: 14.12. MAX: 896.77; MAX: 878.17.
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128 (GFLOP/s, More Is Better): a: 156.22, b: 148.18. SE +/- 2.45, N = 2. (CXX) g++ options: -O3
Apache IoTDB 1.1.2 - Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 200 (point/sec, More Is Better): a: 1199743.22, b: 1141859.25.
NCNN 20230517 - Target: CPU - Model: FastestDet (ms, Fewer Is Better): a: 9.62, b: 10.01. SE +/- 0.05, N = 2; SE +/- 0.28, N = 2. MIN: 9.35 / MAX: 10.52; MIN: 9.4 / MAX: 59.43. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: alexnet (ms, Fewer Is Better): b: 5.44, a: 5.65. SE +/- 0.22, N = 2; SE +/- 0.43, N = 2. MIN: 5.08 / MAX: 7.52; MIN: 5.03 / MAX: 6.71. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Liquid-DSP 1.6 - Threads: 128 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): a: 2519200000, b: 2426350000. SE +/- 13350000.00, N = 2. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256 (GFLOP/s, More Is Better): b: 230.54, a: 222.22. SE +/- 3.31, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256 (GFLOP/s, More Is Better): a: 102.28, b: 98.83. SE +/- 0.00, N = 2. (CXX) g++ options: -O3
NCNN 20230517 - Target: CPU - Model: resnet50 (ms, Fewer Is Better): a: 17.57, b: 18.15. SE +/- 0.36, N = 2; SE +/- 0.83, N = 2. MIN: 16.92 / MAX: 18.88; MIN: 16.98 / MAX: 42.81. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: regnety_400m (ms, Fewer Is Better): a: 38.20, b: 39.37. SE +/- 0.86, N = 2; SE +/- 1.10, N = 2. MIN: 36.18 / MAX: 62.76; MIN: 37.07 / MAX: 103.97. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Apache IoTDB 1.1.2 - Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 200 (Average Latency, Fewer Is Better): b: 14.42, a: 14.84. MAX: 596.84; MAX: 605.55.
Apache IoTDB 1.1.2 - Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 200 (Average Latency, Fewer Is Better): a: 17.54, b: 18.04. MAX: 680.16; MAX: 597.99.
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128 (GFLOP/s, More Is Better): a: 94.45, b: 91.90. SE +/- 1.46, N = 2. (CXX) g++ options: -O3
NCNN 20230517 - Target: CPU - Model: blazeface (ms, Fewer Is Better): a: 4.49, b: 4.61. SE +/- 0.09, N = 2; SE +/- 0.01, N = 2. MIN: 4.31 / MAX: 5.13; MIN: 4.49 / MAX: 5.33. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: Stock - Precision: float - X Y Z: 256 (GFLOP/s, More Is Better): b: 104.35, a: 101.80. SE +/- 0.34, N = 2. (CXX) g++ options: -O3
NCNN 20230517 - Target: CPU - Model: mnasnet (ms, Fewer Is Better): b: 7.43, a: 7.61. SE +/- 0.01, N = 2; SE +/- 0.07, N = 2. MIN: 7.16 / MAX: 15.37; MIN: 7.33 / MAX: 43.29. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: Stock - Precision: double - X Y Z: 128 (GFLOP/s, More Is Better): a: 69.48, b: 67.95. SE +/- 1.01, N = 2. (CXX) g++ options: -O3
NCNN 20230517 - Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better): a: 15.78, b: 16.13. SE +/- 0.07, N = 2; SE +/- 0.41, N = 2. MIN: 15.4 / MAX: 43.08; MIN: 15.35 / MAX: 39.36. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Apache IoTDB 1.1.2 - Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 500 (point/sec, More Is Better): b: 1372429.58, a: 1343156.56.
NCNN 20230517 - Target: CPU - Model: vision_transformer (ms, Fewer Is Better): b: 45.56, a: 46.50. SE +/- 1.19, N = 2; SE +/- 2.46, N = 2. MIN: 43.24 / MAX: 70.59; MIN: 42.6 / MAX: 72.28. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: googlenet (ms, Fewer Is Better): b: 16.72, a: 17.06. SE +/- 0.37, N = 2; SE +/- 1.05, N = 2. MIN: 15.67 / MAX: 100.92; MIN: 15.5 / MAX: 66.12. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Stress-NG 0.15.10 - Test: Pthread (Bogo Ops/s, More Is Better): a: 92131.54, b: 90361.70. SE +/- 279.15, N = 2; SE +/- 894.90, N = 2. (CXX) g++ options: -O2 -std=gnu99 -lc
Apache IoTDB 1.1.2 - Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 500 (Average Latency, Fewer Is Better): b: 32.75, a: 33.38. MAX: 992.49; MAX: 934.86.
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: Stock - Precision: float - X Y Z: 128 (GFLOP/s, More Is Better): a: 185.45, b: 182.03. SE +/- 0.86, N = 2. (CXX) g++ options: -O3
Z3 Theorem Prover 4.12.1 - SMT File: 1.smt2 (Seconds, Fewer Is Better): b: 25.25, a: 25.71. SE +/- 0.07, N = 2; SE +/- 0.03, N = 2. (CXX) g++ options: -lpthread -std=c++17 -fvisibility=hidden -mfpmath=sse -msse -msse2 -O3 -fPIC
Apache IoTDB 1.1.2 - Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 200 (point/sec, More Is Better): b: 920435.77, a: 904320.60.
Embree 4.1 - Binary: Pathtracer - Model: Crown (Frames Per Second, More Is Better): a: 72.04, b: 70.79. SE +/- 0.12, N = 2. MIN: 68.2 / MAX: 79.55; MIN: 67 / MAX: 79.71.
OSPRay 2.12 - Benchmark: particle_volume/scivis/real_time (Items Per Second, More Is Better): b: 24.78, a: 24.36. SE +/- 0.03, N = 2.
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: Stock - Precision: double - X Y Z: 512 (GFLOP/s, More Is Better): a: 94.26, b: 92.72. SE +/- 0.13, N = 2; SE +/- 0.73, N = 2. (CXX) g++ options: -O3
Apache IoTDB 1.1.2 - Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 200 (point/sec, More Is Better): a: 638644.35, b: 628202.55.
OSPRay 2.12 - Benchmark: gravity_spheres_volume/dim_512/scivis/real_time (Items Per Second, More Is Better): a: 20.81, b: 20.48. SE +/- 0.05, N = 2.
Apache IoTDB 1.1.2 - Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 200 (point/sec, More Is Better): b: 34807016.85, a: 34266143.85.
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512 (GFLOP/s, More Is Better): a: 49.44, b: 48.72. SE +/- 0.29, N = 2; SE +/- 0.39, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: Stock - Precision: float - X Y Z: 512 (GFLOP/s, More Is Better): a: 176.63, b: 174.11. SE +/- 0.68, N = 2; SE +/- 0.12, N = 2. (CXX) g++ options: -O3
Apache IoTDB 1.1.2 - Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 200 (Average Latency, Fewer Is Better): b: 42.19, a: 42.79. MAX: 784.56; MAX: 855.16.
Liquid-DSP 1.6 - Threads: 16 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): a: 201615000, b: 198790000. SE +/- 795000.00, N = 2; SE +/- 650000.00, N = 2. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
VVenC 1.9 - Video Input: Bosphorus 4K - Video Preset: Faster (Frames Per Second, More Is Better): b: 10.43, a: 10.28. SE +/- 0.10, N = 2; SE +/- 0.03, N = 2. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
Remhos 1.0 - Test: Sample Remap Example (Seconds, Fewer Is Better): a: 12.20, b: 12.37. SE +/- 0.10, N = 2. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi
NCNN 20230517 - Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better): a: 11.48, b: 11.64. SE +/- 0.23, N = 2; SE +/- 0.26, N = 2. MIN: 10.9 / MAX: 56.34; MIN: 10.85 / MAX: 37.54. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Liquid-DSP 1.6 - Threads: 16 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): b: 623535000, a: 615105000. SE +/- 11305000.00, N = 2; SE +/- 3155000.00, N = 2. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256 (GFLOP/s, More Is Better): b: 46.46, a: 45.85. SE +/- 0.55, N = 2. (CXX) g++ options: -O3
Liquid-DSP 1.6 - Threads: 160 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): b: 2636450000, a: 2602300000. SE +/- 17250000.00, N = 2. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
OSPRay 2.12 - Benchmark: gravity_spheres_volume/dim_512/ao/real_time (Items Per Second, More Is Better): a: 21.21, b: 20.95. SE +/- 0.22, N = 2.
libxsmm 2-1.17-3645 - M N K: 256 (GFLOPS/s, More Is Better): a: 599.8, b: 592.5. SE +/- 2.65, N = 2. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
NCNN 20230517 - Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better): b: 24.48, a: 24.77. SE +/- 0.64, N = 2; SE +/- 0.65, N = 2. MIN: 22.66 / MAX: 47.54; MIN: 22.68 / MAX: 208.18. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: Stock - Precision: float - X Y Z: 512 (GFLOP/s, More Is Better): a: 93.33, b: 92.26. SE +/- 0.40, N = 2; SE +/- 0.24, N = 2. (CXX) g++ options: -O3
Liquid-DSP 1.6 - Threads: 64 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): b: 1825450000, a: 1805000000. SE +/- 2750000.00, N = 2. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6 - Threads: 32 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): a: 400730000, b: 396265000. SE +/- 2775000.00, N = 2. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Stress-NG 0.15.10 - Test: Vector Floating Point (Bogo Ops/s, More Is Better): a: 132479.08, b: 131100.25. SE +/- 872.22, N = 2; SE +/- 235.00, N = 2. (CXX) g++ options: -O2 -std=gnu99 -lc
Liquid-DSP 1.6 - Threads: 32 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): a: 1197700000, b: 1185500000. SE +/- 20600000.00, N = 2. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
NCNN 20230517 - Target: CPU - Model: mobilenet (ms, Fewer Is Better): b: 15.90, a: 16.06. SE +/- 0.28, N = 2; SE +/- 0.82, N = 2. MIN: 15.2 / MAX: 39.56; MIN: 14.92 / MAX: 25.43. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Intel Open Image Denoise 2.0 - Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only (Images / Sec, More Is Better): a: 3.04, b: 3.01. SE +/- 0.00, N = 2.
libxsmm 2-1.17-3645 - M N K: 32 (GFLOPS/s, More Is Better): b: 639.3, a: 633.2. SE +/- 2.35, N = 2. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
Liquid-DSP 1.6 - Threads: 16 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): b: 498410000, a: 493660000. SE +/- 830000.00, N = 2; SE +/- 1500000.00, N = 2. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Z3 Theorem Prover 4.12.1 - SMT File: 2.smt2 (Seconds, Fewer Is Better): b: 87.18, a: 88.00. SE +/- 0.05, N = 2; SE +/- 0.02, N = 2. (CXX) g++ options: -lpthread -std=c++17 -fvisibility=hidden -mfpmath=sse -msse -msse2 -O3 -fPIC
VVenC 1.9 - Video Input: Bosphorus 4K - Video Preset: Fast (Frames Per Second, More Is Better): b: 5.717, a: 5.672. SE +/- 0.063, N = 2; SE +/- 0.019, N = 2. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
NCNN 20230517 - Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better): b: 9.75, a: 9.82. SE +/- 0.06, N = 2; SE +/- 0.03, N = 2. MIN: 9.56 / MAX: 13.68; MIN: 9.6 / MAX: 12.61. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better): a: 8.71, b: 8.77. SE +/- 0.14, N = 2; SE +/- 0.03, N = 2. MIN: 8.43 / MAX: 9.8; MIN: 8.59 / MAX: 32.78. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Intel Open Image Denoise 2.0 - Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only (Images / Sec, More Is Better): a: 1.47, b: 1.46. SE +/- 0.00, N = 2.
Intel Open Image Denoise 2.0 - Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only (Images / Sec, More Is Better): a: 3.05, b: 3.03. SE +/- 0.01, N = 2.
Liquid-DSP 1.6 - Threads: 64 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): b: 730310000, a: 725840000. SE +/- 1250000.00, N = 2. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Blender 3.6 - Blend File: Fishy Cat - Compute: CPU-Only (Seconds, Fewer Is Better): a: 30.59, b: 30.77. SE +/- 0.01, N = 2; SE +/- 0.02, N = 2.
QuantLib 1.30 (MFLOPS, More Is Better): a: 2622.9, b: 2607.6. SE +/- 2.00, N = 2; SE +/- 1.80, N = 2. (CXX) g++ options: -O3 -march=native -fPIE -pie
OSPRay 2.12 - Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time (Items Per Second, More Is Better): a: 22.70, b: 22.58. SE +/- 0.00, N = 2.
Liquid-DSP 1.6 - Threads: 128 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): a: 2961100000, b: 2945150000. SE +/- 6350000.00, N = 2. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512 (GFLOP/s, More Is Better): a: 94.83, b: 94.34. SE +/- 1.11, N = 2; SE +/- 0.95, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: double - X Y Z: 256 b a 20 40 60 80 100 SE +/- 0.72, N = 2 102.43 101.94 1. (CXX) g++ options: -O3
Apache IoTDB Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 500 OpenBenchmarking.org Average Latency, Fewer Is Better Apache IoTDB 1.1.2 Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 500 a b 8 16 24 32 40 35.99 36.16 MAX: 724.8 MAX: 769.5
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: float - X Y Z: 128 b a 20 40 60 80 100 SE +/- 1.54, N = 2 107.95 107.45 1. (CXX) g++ options: -O3
srsRAN Project Test: PUSCH Processor Benchmark, Throughput Total OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.5 Test: PUSCH Processor Benchmark, Throughput Total a b 2K 4K 6K 8K 10K SE +/- 47.35, N = 2 SE +/- 33.95, N = 2 9800.5 9756.7 1. (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest
Liquid-DSP Threads: 128 - Buffer Length: 256 - Filter Length: 512 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 128 - Buffer Length: 256 - Filter Length: 512 a b 200M 400M 600M 800M 1000M SE +/- 2180000.00, N = 2 949400000 945190000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
GPAW Input: Carbon Nanotube OpenBenchmarking.org Seconds, Fewer Is Better GPAW 23.6 Input: Carbon Nanotube b a 10 20 30 40 50 SE +/- 0.03, N = 2 SE +/- 0.02, N = 2 45.64 45.82 1. (CC) gcc options: -shared -fwrapv -O2 -lxc -lblas -lmpi
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128 a b 40 80 120 160 200 SE +/- 0.21, N = 2 159.34 158.71 1. (CXX) g++ options: -O3
Embree Binary: Pathtracer - Model: Asian Dragon Obj OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.1 Binary: Pathtracer - Model: Asian Dragon Obj b a 20 40 60 80 100 SE +/- 0.03, N = 2 SE +/- 0.03, N = 2 77.26 76.96 MIN: 75.78 / MAX: 81.08 MIN: 75.53 / MAX: 82.14
NCNN Target: CPU-v2-v2 - Model: mobilenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU-v2-v2 - Model: mobilenet-v2 a b 2 4 6 8 10 SE +/- 0.13, N = 2 SE +/- 0.02, N = 2 7.91 7.94 MIN: 7.68 / MAX: 9.6 MIN: 7.81 / MAX: 10.99 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512 (GFLOP/s, More Is Better): a = 90.57 (SE +/- 1.08, N = 2), b = 90.24 (SE +/- 1.17, N = 2). (CXX) g++ options: -O3
Liquid-DSP 1.6 - Threads: 64 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): b = 2076650000, a = 2069200000; SE +/- 13650000.00, N = 2. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: Stock - Precision: double - X Y Z: 256 (GFLOP/s, More Is Better): a = 46.66, b = 46.51; SE +/- 0.08, N = 2. (CXX) g++ options: -O3
VVenC 1.9 - Video Input: Bosphorus 1080p - Video Preset: Faster (Frames Per Second, More Is Better): b = 29.18 (SE +/- 0.23, N = 2), a = 29.08 (SE +/- 0.37, N = 2). (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
libxsmm 2-1.17-3645 - M N K: 64 (GFLOPS/s, More Is Better): a = 1219.9, b = 1216.0; SE +/- 1.25, N = 2. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
NCNN 20230517 - Target: CPU - Model: vgg16 (ms, Fewer Is Better): b = 26.19 (SE +/- 0.34, N = 2; MIN: 24.19 / MAX: 341.6), a = 26.27 (SE +/- 0.84, N = 2; MIN: 24.05 / MAX: 301.35). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Blender 3.6 - Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better): b = 23.62 (SE +/- 0.06, N = 2), a = 23.69 (SE +/- 0.09, N = 2).
Liquid-DSP 1.6 - Threads: 160 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): a = 3390700000, b = 3381950000; SE +/- 9150000.00, N = 2. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Blender 3.6 - Blend File: Classroom - Compute: CPU-Only (Seconds, Fewer Is Better): a = 62.35 (SE +/- 0.04, N = 2), b = 62.51 (SE +/- 0.19, N = 2).
Apache IoTDB 1.1.2 - Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 500 (point/sec, More Is Better): b = 1137612.61, a = 1134736.54.
Liquid-DSP 1.6 - Threads: 1 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): a = 13323000 (SE +/- 1000.00, N = 2), b = 13291000 (SE +/- 34000.00, N = 2). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Apache IoTDB 1.1.2 - Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 500 (point/sec, More Is Better): a = 995259.68, b = 992909.69.
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: Stock - Precision: float - X Y Z: 256 (GFLOP/s, More Is Better): a = 236.67, b = 236.12; SE +/- 2.75, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: Stock - Precision: double - X Y Z: 512 (GFLOP/s, More Is Better): b = 47.39 (SE +/- 0.07, N = 2), a = 47.28 (SE +/- 0.14, N = 2). (CXX) g++ options: -O3
Liquid-DSP 1.6 - Threads: 1 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): a = 32338000 (SE +/- 0.00, N = 2), b = 32267000 (SE +/- 0.00, N = 2). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Apache IoTDB 1.1.2 - Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 500 (Average Latency, Fewer Is Better): a = 36.74 (MAX: 793.88), b = 36.82 (MAX: 691.5).
Blender 3.6 - Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better): b = 239.03 (SE +/- 1.32, N = 2), a = 239.55 (SE +/- 0.32, N = 2).
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256 (GFLOP/s, More Is Better): a = 93.01, b = 92.82; SE +/- 1.52, N = 2. (CXX) g++ options: -O3
Liquid-DSP 1.6 - Threads: 160 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better): a = 1013200000, b = 1011200000; SE +/- 1800000.00, N = 2. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Timed GCC Compilation 13.2 - Time To Compile (Seconds, Fewer Is Better): b = 956.13, a = 957.95; SE +/- 1.97, N = 2.
Embree 4.1 - Binary: Pathtracer ISPC - Model: Asian Dragon Obj (Frames Per Second, More Is Better): b = 90.01 (SE +/- 0.05, N = 2; MIN: 87.6 / MAX: 94.43), a = 89.84 (SE +/- 0.06, N = 2; MIN: 87.68 / MAX: 94.71).
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: Stock - Precision: double - X Y Z: 128 (GFLOP/s, More Is Better): a = 117.01, b = 116.84; SE +/- 4.04, N = 2. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512 (GFLOP/s, More Is Better): b = 171.13 (SE +/- 1.39, N = 2), a = 170.91 (SE +/- 1.25, N = 2). (CXX) g++ options: -O3
Embree 4.1 - Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better): b = 104.55 (SE +/- 0.24, N = 2; MIN: 102.2 / MAX: 108.91), a = 104.41 (SE +/- 0.32, N = 2; MIN: 101.88 / MAX: 109.22).
Embree 4.1 - Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, More Is Better): a = 85.24 (SE +/- 0.04, N = 2; MIN: 83.75 / MAX: 89.99), b = 85.13 (SE +/- 0.14, N = 2; MIN: 83.65 / MAX: 90.45).
Stress-NG 0.15.10 - Test: Fused Multiply-Add (Bogo Ops/s, More Is Better): b = 181314757.42 (SE +/- 92010.25, N = 2), a = 181083180.47 (SE +/- 118686.48, N = 2). (CXX) g++ options: -O2 -std=gnu99 -lc
dav1d 1.2.1 - Video Input: Summer Nature 1080p (FPS, More Is Better): a = 699.97, b = 699.09; SE +/- 0.50, N = 2. (CC) gcc options: -pthread -lm
HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128 (GFLOP/s, More Is Better): b = 199.33, a = 199.10; SE +/- 0.39, N = 2. (CXX) g++ options: -O3
VVenC 1.9 - Video Input: Bosphorus 1080p - Video Preset: Fast (Frames Per Second, More Is Better): b = 15.72 (SE +/- 0.04, N = 2), a = 15.71 (SE +/- 0.06, N = 2). (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
Liquid-DSP 1.6 - Threads: 32 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better): b = 993445000, a = 992540000; SE +/- 2085000.00, N = 2. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
OSPRay 2.12 - Benchmark: particle_volume/pathtracer/real_time (Items Per Second, More Is Better): b = 151.27, a = 151.14; SE +/- 0.63, N = 2.
OSPRay 2.12 - Benchmark: particle_volume/ao/real_time (Items Per Second, More Is Better): a = 24.64, b = 24.62; SE +/- 0.09, N = 2.
dav1d 1.2.1 - Video Input: Chimera 1080p (FPS, More Is Better): b = 516.50, a = 516.17; SE +/- 0.06, N = 2. (CC) gcc options: -pthread -lm
srsRAN Project 23.5 - Test: PUSCH Processor Benchmark, Throughput Thread (Mbps, More Is Better): a = 164.8 (SE +/- 1.70, N = 2), b = 164.7 (SE +/- 0.90, N = 2). (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest
srsRAN Project 23.5 - Test: Downlink Processor Benchmark (Mbps, More Is Better): b = 556.8 (SE +/- 1.25, N = 2), a = 556.5 (SE +/- 0.70, N = 2). (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest
Stress-NG 0.15.10 - Test: Vector Shuffle (Bogo Ops/s, More Is Better): b = 48076.78 (SE +/- 42.22, N = 2), a = 48054.48 (SE +/- 1.26, N = 2). (CXX) g++ options: -O2 -std=gnu99 -lc
dav1d 1.2.1 - Video Input: Summer Nature 4K (FPS, More Is Better): b = 282.65, a = 282.53; SE +/- 0.08, N = 2. (CC) gcc options: -pthread -lm
Stress-NG 0.15.10 - Test: Wide Vector Math (Bogo Ops/s, More Is Better): b = 2196242.21 (SE +/- 497.69, N = 2), a = 2195391.41 (SE +/- 1200.16, N = 2). (CXX) g++ options: -O2 -std=gnu99 -lc
Opus Codec Encoding 1.4 - WAV To Opus Encode (Seconds, Fewer Is Better): b = 36.73, a = 36.74; SE +/- 0.01, N = 2. (CXX) g++ options: -O3 -fvisibility=hidden -logg -lm
Stress-NG 0.15.10 - Test: AVL Tree (Bogo Ops/s, More Is Better): b = 610.83 (SE +/- 0.44, N = 2), a = 610.69 (SE +/- 0.08, N = 2). (CXX) g++ options: -O2 -std=gnu99 -lc
Liquid-DSP 1.6 - Threads: 1 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): b = 53926500 (SE +/- 1500.00, N = 2), a = 53918500 (SE +/- 500.00, N = 2). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
dav1d 1.2.1 - Video Input: Chimera 1080p 10-bit (FPS, More Is Better): a = 476.82, b = 476.77; SE +/- 0.41, N = 2. (CC) gcc options: -pthread -lm
Stress-NG 0.15.10 - Test: Matrix 3D Math (Bogo Ops/s, More Is Better): a = 12743.81 (SE +/- 9.80, N = 2), b = 12742.70 (SE +/- 5.06, N = 2). (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.15.10 - Test: Floating Point (Bogo Ops/s, More Is Better): a = 21134.81 (SE +/- 6.86, N = 2), b = 21133.02 (SE +/- 9.14, N = 2). (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.15.10 - Test: Zlib (Bogo Ops/s, More Is Better): b = 6880.22 (SE +/- 4.39, N = 2), a = 6879.86 (SE +/- 8.83, N = 2). (CXX) g++ options: -O2 -std=gnu99 -lc
Embree 4.1 - Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better): b = 87.93 (MIN: 84.73 / MAX: 92.37), a = 87.93 (MIN: 85.27 / MAX: 92.58); SE +/- 0.10, N = 2.
Apache CouchDB 3.3.2 - Bulk Size: 500 - Inserts: 1000 - Rounds: 30 (Seconds, Fewer Is Better): a = 1090.42. (CXX) g++ options: -std=c++17 -lmozjs-78 -lm -lei -fPIC -MMD
Apache CouchDB 3.3.2 - Bulk Size: 300 - Inserts: 1000 - Rounds: 30 (Seconds, Fewer Is Better): a = 152.46. (CXX) g++ options: -std=c++17 -lmozjs-78 -lm -lei -fPIC -MMD
Apache CouchDB 3.3.2 - Bulk Size: 100 - Inserts: 1000 - Rounds: 30 (Seconds, Fewer Is Better): a = 94.83. (CXX) g++ options: -std=c++17 -lmozjs-78 -lm -lei -fPIC -MMD
Phoronix Test Suite v10.8.5