dddxxx: Tests for a future article. Intel Core i7-8565U testing with a Dell 0KTW76 (1.17.0 BIOS) and Intel UHD 620 WHL GT2 15GB graphics on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2308061-NE-DDDXXX46317&rdt&grs.
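Configurations a and b below run on identical hardware and differ mainly in their GCC toolchain (11.3.0 vs. 11.4.0) and point-release Java/Python packages, so the interesting number for each test is the relative change between the two runs. As a minimal sketch of that bookkeeping (not part of the Phoronix Test Suite itself), the snippet below recomputes the percentage change for three results taken from the graphs further down; the only assumption made here is which direction counts as better for each unit.

```python
# Minimal sketch: percent change of run b relative to run a for a few results
# copied from the graphs below in this result file.

def pct_change(a: float, b: float, lower_is_better: bool) -> float:
    """Positive return value means configuration b improved on configuration a."""
    if lower_is_better:
        return (a - b) / a * 100.0
    return (b - a) / a * 100.0

results = [
    # (test, run a, run b, lower-is-better?)
    ("SQLite, Threads/Copies: 1 [s]",                   30.33,     88.59,      True),
    ("DeepSparse BERT-Large QA, Sparse INT8 [items/s]", 21.30,     34.32,      False),
    ("Redis memtier, 500 clients, 1:5 [ops/s]",         890374.23, 1248610.70, False),
]

for name, a, b, lower in results:
    print(f"{name}: {pct_change(a, b, lower):+.1f}% for b vs. a")
```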
dddxxx - System Details (configurations a and b)
Processor: Intel Core i7-8565U @ 4.60GHz (4 Cores / 8 Threads)
Motherboard: Dell 0KTW76 (1.17.0 BIOS)
Chipset: Intel Cannon Point-LP
Memory: 16GB
Disk: SK hynix PC401 NVMe 256GB
Graphics: Intel UHD 620 WHL GT2 15GB (1150MHz)
Audio: Realtek ALC3271
Network: Qualcomm Atheros QCA6174 802.11ac
OS: Ubuntu 22.04
Kernel: 5.19.0-rc6-phx-retbleed (x86_64)
Desktop: GNOME Shell 42.2
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 22.0.1
OpenCL: OpenCL 3.0
Vulkan: 1.3.204
Compiler: a: GCC 11.3.0 / b: GCC 11.4.0
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details - Transparent Huge Pages: madvise
Compiler Details - a: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Compiler Details - b: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Disk Details - NONE / errors=remount-ro,relatime,rw / Block Size: 4096
Processor Details - Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0xf0 - Thermald 2.4.9
Java Details - a: OpenJDK Runtime Environment (build 11.0.18+10-post-Ubuntu-0ubuntu122.04) - b: OpenJDK Runtime Environment (build 11.0.20+8-post-Ubuntu-1ubuntu122.04)
Python Details - a: Python 3.10.6 - b: Python 3.10.12
Security Details - itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Mitigation of Clear buffers; SMT vulnerable + meltdown: Not affected + mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable + retbleed: Mitigation of IBRS + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of IBRS IBPB: conditional RSB filling + srbds: Mitigation of Microcode + tsx_async_abort: Not affected
dddxxx - Result Overview: side-by-side values for configurations a and b across the full run (SQLite, Neural Magic DeepSparse, NCNN, Redis memtier_benchmark, Stress-NG, SVT-AV1, Apache IoTDB, OSPRay, dav1d, Liquid-DSP, Memcached, Embree, libxsmm, Build2, Timed LLVM Compilation, Timed Godot Compilation, Z3, QuantLib, VVenC, Xonotic, Apache Cassandra, Opus encoding, Intel Open Image Denoise, vkpeak). The per-test figures are repeated in the individual result graphs that follow.
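If a single headline figure is wanted for the whole run, the usual choice is the geometric mean of the per-test ratios, with every ratio oriented so that values above 1.0 favor configuration b; the geometric mean keeps tests with wildly different units and magnitudes from dominating. A hedged sketch, reusing the same three example results as in the snippet above (the full result file would supply the complete list):

```python
import math

# Per-test ratios oriented so that a value above 1.0 favors configuration b:
# use a/b for "fewer is better" metrics and b/a for "more is better" metrics.
ratios = [
    30.33 / 88.59,           # SQLite, Threads/Copies: 1 (seconds, fewer is better)
    34.32 / 21.30,           # DeepSparse BERT-Large QA, Sparse INT8 (items/s, more is better)
    1248610.70 / 890374.23,  # Redis memtier, 500 clients, 1:5 (ops/s, more is better)
]

geo_mean = math.prod(ratios) ** (1.0 / len(ratios))
print(f"Geometric mean ratio, b vs. a: {geo_mean:.3f}")
```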
SQLite Threads / Copies: 1 OpenBenchmarking.org Seconds, Fewer Is Better SQLite 3.41.2 Threads / Copies: 1 a b 20 40 60 80 100 SE +/- 0.86, N = 2 SE +/- 0.19, N = 2 30.33 88.59 1. (CC) gcc options: -O2 -lreadline -ltermcap -lz -lm
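The "SE +/-" figures reported with each graph are taken here to be the standard error of the mean across the N = 2 runs, i.e. the sample standard deviation divided by the square root of N. The sketch below illustrates the arithmetic with two hypothetical per-run times chosen to reproduce run a's reported mean (30.33 s) and SE (0.86); the individual run times themselves are not part of this export.

```python
import math
import statistics

def standard_error(samples: list[float]) -> float:
    """Sample standard deviation divided by the square root of the sample count."""
    return statistics.stdev(samples) / math.sqrt(len(samples))

# Hypothetical per-run times (the export only publishes the mean and SE); these
# two values are chosen so that they reproduce run a's reported 30.33 s mean
# and 0.86 standard error.
runs = [29.47, 31.19]
print(f"mean = {statistics.mean(runs):.2f} s, SE +/- {standard_error(runs):.2f}")
```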
Neural Magic DeepSparse Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream a b 8 16 24 32 40 SE +/- 0.07, N = 2 SE +/- 0.13, N = 2 21.30 34.32
Neural Magic DeepSparse Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream a b 20 40 60 80 100 SE +/- 0.32, N = 2 SE +/- 0.22, N = 2 93.85 58.24
NCNN Target: CPU - Model: shufflenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: shufflenet-v2 a b 0.9653 1.9306 2.8959 3.8612 4.8265 SE +/- 0.02, N = 2 SE +/- 0.01, N = 2 4.29 3.02 MIN: 3.99 / MAX: 24.93 MIN: 2.73 / MAX: 18.89 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Redis 7.0.12 + memtier_benchmark Protocol: Redis - Clients: 500 - Set To Get Ratio: 1:5 OpenBenchmarking.org Ops/sec, More Is Better Redis 7.0.12 + memtier_benchmark 2.0 Protocol: Redis - Clients: 500 - Set To Get Ratio: 1:5 a b 300K 600K 900K 1200K 1500K SE +/- 8374.42, N = 2 SE +/- 25673.44, N = 2 890374.23 1248610.70 1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
SQLite Threads / Copies: 2 OpenBenchmarking.org Seconds, Fewer Is Better SQLite 3.41.2 Threads / Copies: 2 a b 40 80 120 160 200 SE +/- 24.93, N = 2 SE +/- 1.42, N = 2 123.47 170.01 1. (CC) gcc options: -O2 -lreadline -ltermcap -lz -lm
SQLite Threads / Copies: 4 OpenBenchmarking.org Seconds, Fewer Is Better SQLite 3.41.2 Threads / Copies: 4 a b 40 80 120 160 200 SE +/- 17.49, N = 2 SE +/- 0.50, N = 2 125.83 173.05 1. (CC) gcc options: -O2 -lreadline -ltermcap -lz -lm
Redis 7.0.12 + memtier_benchmark Protocol: Redis - Clients: 500 - Set To Get Ratio: 1:10 OpenBenchmarking.org Ops/sec, More Is Better Redis 7.0.12 + memtier_benchmark 2.0 Protocol: Redis - Clients: 500 - Set To Get Ratio: 1:10 a b 300K 600K 900K 1200K 1500K SE +/- 45279.14, N = 2 SE +/- 63299.47, N = 2 914462.02 1253523.69 1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Stress-NG Test: MMAP OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: MMAP a b 7 14 21 28 35 SE +/- 3.19, N = 2 SE +/- 3.70, N = 2 20.86 28.46 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Redis 7.0.12 + memtier_benchmark Protocol: Redis - Clients: 100 - Set To Get Ratio: 1:5 OpenBenchmarking.org Ops/sec, More Is Better Redis 7.0.12 + memtier_benchmark 2.0 Protocol: Redis - Clients: 100 - Set To Get Ratio: 1:5 a b 300K 600K 900K 1200K 1500K SE +/- 44860.53, N = 2 SE +/- 39391.55, N = 2 1016703.27 1372033.18 1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Redis 7.0.12 + memtier_benchmark Protocol: Redis - Clients: 50 - Set To Get Ratio: 1:5 OpenBenchmarking.org Ops/sec, More Is Better Redis 7.0.12 + memtier_benchmark 2.0 Protocol: Redis - Clients: 50 - Set To Get Ratio: 1:5 a b 300K 600K 900K 1200K 1500K SE +/- 43465.56, N = 2 SE +/- 4729.88, N = 2 1039311.12 1398883.01 1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
NCNN Target: CPU - Model: mnasnet OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: mnasnet a b 2 4 6 8 10 SE +/- 0.02, N = 2 SE +/- 0.02, N = 2 6.04 4.49 MIN: 5.41 / MAX: 26.6 MIN: 4.03 / MAX: 20.38 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Redis 7.0.12 + memtier_benchmark Protocol: Redis - Clients: 100 - Set To Get Ratio: 1:10 OpenBenchmarking.org Ops/sec, More Is Better Redis 7.0.12 + memtier_benchmark 2.0 Protocol: Redis - Clients: 100 - Set To Get Ratio: 1:10 a b 300K 600K 900K 1200K 1500K SE +/- 568.99, N = 2 SE +/- 8107.22, N = 2 1028641.19 1377833.48 1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Stress-NG Test: MEMFD OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: MEMFD a b 13 26 39 52 65 SE +/- 0.79, N = 2 SE +/- 0.72, N = 2 56.59 43.09 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Redis 7.0.12 + memtier_benchmark Protocol: Redis - Clients: 50 - Set To Get Ratio: 1:10 OpenBenchmarking.org Ops/sec, More Is Better Redis 7.0.12 + memtier_benchmark 2.0 Protocol: Redis - Clients: 50 - Set To Get Ratio: 1:10 a b 300K 600K 900K 1200K 1500K SE +/- 19181.49, N = 2 SE +/- 18012.65, N = 2 1109812.44 1455171.86 1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Stress-NG Test: Futex OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Futex a b 130K 260K 390K 520K 650K SE +/- 36419.89, N = 2 SE +/- 57507.32, N = 2 486078.68 624014.17 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Stress-NG Test: Forking OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Forking a b 3K 6K 9K 12K 15K SE +/- 301.28, N = 2 SE +/- 1330.94, N = 2 9551.79 12140.60 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Neural Magic DeepSparse Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream a b 160 320 480 640 800 SE +/- 113.73, N = 2 SE +/- 2.10, N = 2 764.27 614.68
Stress-NG Test: CPU Cache OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: CPU Cache a b 200K 400K 600K 800K 1000K SE +/- 244998.38, N = 2 SE +/- 138960.51, N = 2 929178.43 1153192.75 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Stress-NG Test: Crypto OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Crypto a b 1200 2400 3600 4800 6000 SE +/- 47.72, N = 2 SE +/- 96.10, N = 2 4590.53 5684.53 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Stress-NG Test: Zlib OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Zlib a b 70 140 210 280 350 SE +/- 31.94, N = 2 SE +/- 21.49, N = 2 273.05 335.91 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
SVT-AV1 Encoder Mode: Preset 12 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.6 Encoder Mode: Preset 12 - Input: Bosphorus 4K a b 8 16 24 32 40 SE +/- 0.91, N = 2 SE +/- 0.81, N = 2 35.50 28.98 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Stress-NG Test: Hash OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Hash a b 140K 280K 420K 560K 700K SE +/- 38337.11, N = 2 SE +/- 38690.57, N = 2 523699.70 638650.51 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Neural Magic DeepSparse Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream a b 0.7304 1.4608 2.1912 2.9216 3.652 SE +/- 0.3971, N = 2 SE +/- 0.0188, N = 2 2.6677 3.2461
Stress-NG Test: Pthread OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Pthread a b 9K 18K 27K 36K 45K SE +/- 4732.28, N = 2 SE +/- 4375.87, N = 2 34974.26 42229.61 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
NCNN Target: CPU-v3-v3 - Model: mobilenet-v3 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU-v3-v3 - Model: mobilenet-v3 a b 1.1565 2.313 3.4695 4.626 5.7825 SE +/- 0.80, N = 2 SE +/- 0.04, N = 2 5.14 4.31 MIN: 4.05 / MAX: 26.81 MIN: 4.05 / MAX: 20.17 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Stress-NG Test: SENDFILE OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: SENDFILE a b 9K 18K 27K 36K 45K SE +/- 1669.87, N = 2 SE +/- 1151.61, N = 2 36779.86 43752.37 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Neural Magic DeepSparse Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream a b 50 100 150 200 250 SE +/- 13.47, N = 2 SE +/- 3.79, N = 2 211.98 179.08
Neural Magic DeepSparse Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream a b 5 10 15 20 25 SE +/- 2.36, N = 2 SE +/- 3.12, N = 2 17.51 20.73
Neural Magic DeepSparse Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream a b 3 6 9 12 15 SE +/- 0.6045, N = 2 SE +/- 0.2364, N = 2 9.4553 11.1719
Neural Magic DeepSparse Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream a b 30 60 90 120 150 SE +/- 15.66, N = 2 SE +/- 14.87, N = 2 116.27 98.71
Stress-NG Test: AVL Tree OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: AVL Tree a b 5 10 15 20 25 SE +/- 0.05, N = 2 SE +/- 0.15, N = 2 16.41 19.30 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Neural Magic DeepSparse Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream a b 0.5374 1.0748 1.6122 2.1496 2.687 SE +/- 0.0085, N = 2 SE +/- 0.0327, N = 2 2.0309 2.3883
Neural Magic DeepSparse Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream a b 200 400 600 800 1000 SE +/- 0.29, N = 2 SE +/- 13.19, N = 2 979.18 835.85
Stress-NG Test: Malloc OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Malloc a b 90K 180K 270K 360K 450K SE +/- 39637.02, N = 2 SE +/- 46761.51, N = 2 375933.99 434156.10 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
SVT-AV1 Encoder Mode: Preset 8 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.6 Encoder Mode: Preset 8 - Input: Bosphorus 1080p a b 8 16 24 32 40 SE +/- 0.87, N = 2 SE +/- 2.58, N = 2 35.74 31.01 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 Encoder Mode: Preset 4 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.6 Encoder Mode: Preset 4 - Input: Bosphorus 4K a b 0.2457 0.4914 0.7371 0.9828 1.2285 SE +/- 0.023, N = 2 SE +/- 0.003, N = 2 1.092 0.948 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Stress-NG Test: Mutex OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Mutex a b 150K 300K 450K 600K 750K SE +/- 76155.40, N = 2 SE +/- 20730.70, N = 2 604156.72 694562.47 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
OSPRay Benchmark: gravity_spheres_volume/dim_512/scivis/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 2.12 Benchmark: gravity_spheres_volume/dim_512/scivis/real_time a b 0.1597 0.3194 0.4791 0.6388 0.7985 SE +/- 0.014296, N = 2 SE +/- 0.025213, N = 2 0.622182 0.709986
dav1d Video Input: Summer Nature 1080p OpenBenchmarking.org FPS, More Is Better dav1d 1.2.1 Video Input: Summer Nature 1080p a b 70 140 210 280 350 SE +/- 4.15, N = 2 SE +/- 2.55, N = 2 273.35 310.94 1. (CC) gcc options: -pthread -lm
Apache IoTDB Device Count: 200 - Batch Size Per Write: 100 - Sensor Count: 500 OpenBenchmarking.org Average Latency, Fewer Is Better Apache IoTDB 1.1.2 Device Count: 200 - Batch Size Per Write: 100 - Sensor Count: 500 a b 120 240 360 480 600 550.47 485.93 MAX: 3020.53 MAX: 2859.52
Apache IoTDB Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 500 OpenBenchmarking.org Average Latency, Fewer Is Better Apache IoTDB 1.1.2 Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 500 a b 80 160 240 320 400 342.21 387.15 MAX: 2349.14 MAX: 2248.18
Apache IoTDB Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 500 OpenBenchmarking.org point/sec, More Is Better Apache IoTDB 1.1.2 Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 500 a b 3M 6M 9M 12M 15M 13467917.31 12006039.14
Stress-NG Test: NUMA OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: NUMA a b 14 28 42 56 70 SE +/- 5.63, N = 2 SE +/- 5.27, N = 2 53.95 60.49 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
SVT-AV1 Encoder Mode: Preset 4 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.6 Encoder Mode: Preset 4 - Input: Bosphorus 1080p a b 1.0739 2.1478 3.2217 4.2956 5.3695 SE +/- 0.495, N = 2 SE +/- 0.109, N = 2 4.260 4.773 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Stress-NG Test: Poll OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Poll a b 70K 140K 210K 280K 350K SE +/- 20478.34, N = 2 SE +/- 23847.48, N = 2 284300.80 318466.50 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Stress-NG Test: Atomic OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Atomic a b 50 100 150 200 250 SE +/- 10.78, N = 2 SE +/- 11.52, N = 2 224.51 249.99 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
SVT-AV1 Encoder Mode: Preset 13 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.6 Encoder Mode: Preset 13 - Input: Bosphorus 4K a b 8 16 24 32 40 SE +/- 2.10, N = 2 SE +/- 2.71, N = 2 34.05 30.98 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Apache IoTDB Device Count: 200 - Batch Size Per Write: 100 - Sensor Count: 500 OpenBenchmarking.org point/sec, More Is Better Apache IoTDB 1.1.2 Device Count: 200 - Batch Size Per Write: 100 - Sensor Count: 500 a b 2M 4M 6M 8M 10M 8609112.97 9394851.37
OSPRay Benchmark: particle_volume/ao/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 2.12 Benchmark: particle_volume/ao/real_time a b 0.336 0.672 1.008 1.344 1.68 SE +/- 0.09686, N = 2 SE +/- 0.02805, N = 2 1.49334 1.36920
Stress-NG Test: Fused Multiply-Add OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Fused Multiply-Add a b 600K 1200K 1800K 2400K 3000K SE +/- 63230.65, N = 2 SE +/- 262560.86, N = 2 2978232.73 2737483.30 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
NCNN Target: CPU-v2-v2 - Model: mobilenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU-v2-v2 - Model: mobilenet-v2 a b 2 4 6 8 10 SE +/- 0.70, N = 2 SE +/- 0.07, N = 2 6.69 6.15 MIN: 5.6 / MAX: 28.58 MIN: 5.65 / MAX: 26.38 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream a b 0.7272 1.4544 2.1816 2.9088 3.636 SE +/- 0.0495, N = 2 SE +/- 0.4116, N = 2 2.9815 3.2320
Stress-NG Test: Vector Floating Point OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Vector Floating Point a b 2K 4K 6K 8K 10K SE +/- 364.11, N = 2 SE +/- 310.83, N = 2 7289.32 7877.70 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Stress-NG Test: Pipe OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Pipe a b 300K 600K 900K 1200K 1500K SE +/- 30380.48, N = 2 SE +/- 131648.17, N = 2 1422109.84 1533355.11 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Stress-NG Test: CPU Stress OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: CPU Stress a b 2K 4K 6K 8K 10K SE +/- 567.82, N = 2 SE +/- 158.65, N = 2 7310.05 7866.40 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Stress-NG Test: IO_uring OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: IO_uring a b 40K 80K 120K 160K 200K SE +/- 2497.26, N = 2 SE +/- 499.46, N = 2 170492.16 182936.68 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Stress-NG Test: Glibc Qsort Data Sorting OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Glibc Qsort Data Sorting a b 16 32 48 64 80 SE +/- 5.90, N = 2 SE +/- 3.74, N = 2 69.13 74.13 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Apache IoTDB Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 200 OpenBenchmarking.org Average Latency, Fewer Is Better Apache IoTDB 1.1.2 Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 200 a b 50 100 150 200 250 212.43 198.41 MAX: 2080.14 MAX: 2427.81
Apache IoTDB Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 500 OpenBenchmarking.org Average Latency, Fewer Is Better Apache IoTDB 1.1.2 Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 500 a b 20 40 60 80 100 75.40 70.46 MAX: 1528.52 MAX: 1429.13
Apache IoTDB Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 500 OpenBenchmarking.org point/sec, More Is Better Apache IoTDB 1.1.2 Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 500 a b 1.1M 2.2M 3.3M 4.4M 5.5M 4948660.19 4629287.79
Stress-NG Test: Cloning OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Cloning a b 150 300 450 600 750 SE +/- 28.62, N = 2 SE +/- 2.47, N = 2 669.41 715.58 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
NCNN Target: CPU - Model: mobilenet OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: mobilenet a b 7 14 21 28 35 SE +/- 0.19, N = 2 SE +/- 0.03, N = 2 25.93 27.68 MIN: 23.82 / MAX: 45.07 MIN: 24.92 / MAX: 47.55 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Liquid-DSP Threads: 4 - Buffer Length: 256 - Filter Length: 57 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 4 - Buffer Length: 256 - Filter Length: 57 a b 30M 60M 90M 120M 150M SE +/- 6360000.00, N = 2 SE +/- 3395000.00, N = 2 119590000 112205000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream a b 140 280 420 560 700 SE +/- 12.84, N = 2 SE +/- 79.47, N = 2 669.18 628.36
NCNN Target: Vulkan GPU - Model: regnety_400m OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: regnety_400m a b 3 6 9 12 15 SE +/- 0.01, N = 2 SE +/- 0.07, N = 2 9.82 10.41 MIN: 9.26 / MAX: 25.33 MIN: 9.74 / MAX: 26.42 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Liquid-DSP Threads: 8 - Buffer Length: 256 - Filter Length: 57 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 8 - Buffer Length: 256 - Filter Length: 57 a b 30M 60M 90M 120M 150M SE +/- 5615000.00, N = 2 SE +/- 7270000.00, N = 2 140345000 132830000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Stress-NG Test: Glibc C String Functions OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Glibc C String Functions a b 500K 1000K 1500K 2000K 2500K SE +/- 92139.37, N = 2 SE +/- 158119.57, N = 2 2485856.07 2360589.10 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Stress-NG Test: System V Message Passing OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: System V Message Passing a b 600K 1200K 1800K 2400K 3000K SE +/- 119752.37, N = 2 SE +/- 200089.21, N = 2 2652462.62 2519916.10 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Neural Magic DeepSparse Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream a b 40 80 120 160 200 SE +/- 21.74, N = 2 SE +/- 1.05, N = 2 162.16 154.26
Stress-NG Test: Function Call OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Function Call a b 500 1000 1500 2000 2500 SE +/- 96.37, N = 2 SE +/- 40.41, N = 2 2058.70 2164.20 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
OSPRay Benchmark: particle_volume/pathtracer/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 2.12 Benchmark: particle_volume/pathtracer/real_time a b 12 24 36 48 60 SE +/- 0.59, N = 2 SE +/- 0.02, N = 2 52.59 50.08
NCNN Target: CPU - Model: resnet50 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: resnet50 a b 8 16 24 32 40 SE +/- 0.05, N = 2 SE +/- 1.96, N = 2 36.02 34.33 MIN: 31.18 / MAX: 61.6 MIN: 30.77 / MAX: 59.68 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OSPRay Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 2.12 Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time a b 0.2389 0.4778 0.7167 0.9556 1.1945 SE +/- 0.011835, N = 2 SE +/- 0.016867, N = 2 1.061640 1.012953
libxsmm M N K: 128 OpenBenchmarking.org GFLOPS/s, More Is Better libxsmm 2-1.17-3645 M N K: 128 a b 20 40 60 80 100 SE +/- 4.60, N = 2 SE +/- 4.10, N = 2 79.9 83.7 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -march=core-avx2
NCNN Target: CPU - Model: squeezenet_ssd OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: squeezenet_ssd a b 4 8 12 16 20 SE +/- 0.30, N = 2 SE +/- 0.37, N = 2 16.55 15.80 MIN: 15.48 / MAX: 37.44 MIN: 13.81 / MAX: 36.58 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Embree Binary: Pathtracer ISPC - Model: Asian Dragon Obj OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.1 Binary: Pathtracer ISPC - Model: Asian Dragon Obj a b 0.9376 1.8752 2.8128 3.7504 4.688 SE +/- 0.0272, N = 2 SE +/- 0.1882, N = 2 4.1670 3.9923 MIN: 3.5 / MAX: 5.07 MIN: 3.51 / MAX: 5.03
Apache IoTDB Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 500 OpenBenchmarking.org Average Latency, Fewer Is Better Apache IoTDB 1.1.2 Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 500 a b 200 400 600 800 1000 984.70 1026.93 MAX: 4944.15 MAX: 6211.26
Liquid-DSP Threads: 8 - Buffer Length: 256 - Filter Length: 32 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 8 - Buffer Length: 256 - Filter Length: 32 a b 40M 80M 120M 160M 200M SE +/- 11475000.00, N = 2 SE +/- 5480000.00, N = 2 177945000 185540000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Stress-NG Test: x86_64 RdRand OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: x86_64 RdRand a b 700 1400 2100 2800 3500 SE +/- 13.37, N = 2 SE +/- 68.45, N = 2 3267.31 3134.04 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Apache IoTDB Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 200 OpenBenchmarking.org Average Latency, Fewer Is Better Apache IoTDB 1.1.2 Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 200 a b 4 8 12 16 20 15.75 15.11 MAX: 1425.54 MAX: 1458.64
Apache IoTDB Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 500 OpenBenchmarking.org point/sec, More Is Better Apache IoTDB 1.1.2 Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 500 a b 140K 280K 420K 560K 700K 611157.73 637041.62
Stress-NG Test: Context Switching OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Context Switching a b 160K 320K 480K 640K 800K SE +/- 53162.51, N = 2 SE +/- 16234.88, N = 2 705297.96 733269.79 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Apache IoTDB Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 200 OpenBenchmarking.org Average Latency, Fewer Is Better Apache IoTDB 1.1.2 Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 200 a b 4 8 12 16 20 17.34 16.68 MAX: 866.08 MAX: 1251.48
Apache IoTDB Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 200 OpenBenchmarking.org point/sec, More Is Better Apache IoTDB 1.1.2 Device Count: 500 - Batch Size Per Write: 100 - Sensor Count: 200 a b 2M 4M 6M 8M 10M 8900311.03 9246224.42
NCNN Target: CPU - Model: efficientnet-b0 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: efficientnet-b0 a b 3 6 9 12 15 SE +/- 0.65, N = 2 SE +/- 0.02, N = 2 9.78 9.43 MIN: 8.38 / MAX: 32.56 MIN: 8.62 / MAX: 26.5 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: FastestDet OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: FastestDet a b 1.2735 2.547 3.8205 5.094 6.3675 SE +/- 0.20, N = 2 SE +/- 0.04, N = 2 5.66 5.47 MIN: 5.23 / MAX: 24.86 MIN: 5.21 / MAX: 24.93 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Apache IoTDB Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 200 OpenBenchmarking.org point/sec, More Is Better Apache IoTDB 1.1.2 Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 200 a b 3M 6M 9M 12M 15M 15017343.49 15526243.84
Stress-NG Test: Matrix 3D Math OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Matrix 3D Math a b 90 180 270 360 450 SE +/- 5.36, N = 2 SE +/- 6.17, N = 2 383.49 396.03 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Apache IoTDB Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 200 OpenBenchmarking.org Average Latency, Fewer Is Better Apache IoTDB 1.1.2 Device Count: 100 - Batch Size Per Write: 100 - Sensor Count: 200 a b 20 40 60 80 100 107.84 104.43 MAX: 1809.87 MAX: 2007.51
Neural Magic DeepSparse Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream a b 3 6 9 12 15 SE +/- 1.68, N = 2 SE +/- 0.10, N = 2 12.54 12.95
NCNN Target: Vulkan GPU - Model: squeezenet_ssd OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: squeezenet_ssd a b 4 8 12 16 20 SE +/- 0.47, N = 2 SE +/- 0.84, N = 2 15.97 15.48 MIN: 13.93 / MAX: 35.68 MIN: 13.66 / MAX: 36.39 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Memcached Set To Get Ratio: 1:10 OpenBenchmarking.org Ops/sec, More Is Better Memcached 1.6.19 Set To Get Ratio: 1:10 a b 110K 220K 330K 440K 550K SE +/- 9547.84, N = 2 SE +/- 4490.17, N = 2 502050.43 486807.49 1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Neural Magic DeepSparse Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream a b 15 30 45 60 75 SE +/- 6.09, N = 2 SE +/- 7.55, N = 2 65.95 68.00
VVenC Video Input: Bosphorus 1080p - Video Preset: Fast OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.9 Video Input: Bosphorus 1080p - Video Preset: Fast a b 0.8786 1.7572 2.6358 3.5144 4.393 SE +/- 0.015, N = 2 SE +/- 0.042, N = 2 3.905 3.790 1. (CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto
NCNN Target: CPU - Model: FastestDet OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: FastestDet a b 1.2488 2.4976 3.7464 4.9952 6.244 SE +/- 0.04, N = 2 SE +/- 0.16, N = 2 5.39 5.55 MIN: 5.09 / MAX: 24.31 MIN: 5.23 / MAX: 25.23 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: mobilenet OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: mobilenet a b 7 14 21 28 35 SE +/- 0.19, N = 2 SE +/- 0.57, N = 2 28.15 27.38 MIN: 26.87 / MAX: 48.42 MIN: 24.23 / MAX: 47.57 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Neural Magic DeepSparse Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream a b 7 14 21 28 35 SE +/- 2.83, N = 2 SE +/- 3.30, N = 2 30.57 29.76
Apache IoTDB Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 500 OpenBenchmarking.org point/sec, More Is Better Apache IoTDB 1.1.2 Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 500 a b 300K 600K 900K 1200K 1500K 1333775.59 1298988.31
Stress-NG Test: Floating Point OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Floating Point a b 160 320 480 640 800 SE +/- 25.00, N = 2 SE +/- 25.67, N = 2 743.83 762.86 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Build2 Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Build2 0.15 Time To Compile a b 120 240 360 480 600 SE +/- 1.14, N = 2 SE +/- 4.22, N = 2 555.94 569.61
NCNN Target: CPU - Model: yolov4-tiny OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: yolov4-tiny a b 9 18 27 36 45 SE +/- 0.08, N = 2 SE +/- 0.83, N = 2 38.37 39.31 MIN: 36.54 / MAX: 54.65 MIN: 36.44 / MAX: 59.39 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Neural Magic DeepSparse Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream a b 7 14 21 28 35 SE +/- 2.87, N = 2 SE +/- 2.67, N = 2 29.66 28.96
Neural Magic DeepSparse Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream a b 15 30 45 60 75 SE +/- 6.60, N = 2 SE +/- 6.42, N = 2 68.05 69.62
dav1d Video Input: Chimera 1080p OpenBenchmarking.org FPS, More Is Better dav1d 1.2.1 Video Input: Chimera 1080p a b 50 100 150 200 250 SE +/- 2.76, N = 2 SE +/- 1.95, N = 2 244.12 249.62 1. (CC) gcc options: -pthread -lm
Liquid-DSP Threads: 8 - Buffer Length: 256 - Filter Length: 512 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 8 - Buffer Length: 256 - Filter Length: 512 a b 8M 16M 24M 32M 40M SE +/- 3468000.00, N = 2 SE +/- 2570000.00, N = 2 35719000 36523000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
NCNN Target: CPU - Model: blazeface OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: blazeface a b 0.2093 0.4186 0.6279 0.8372 1.0465 SE +/- 0.01, N = 2 SE +/- 0.00, N = 2 0.91 0.93 MIN: 0.78 / MAX: 3.03 MIN: 0.8 / MAX: 3.09 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OSPRay Benchmark: particle_volume/scivis/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 2.12 Benchmark: particle_volume/scivis/real_time a b 0.3091 0.6182 0.9273 1.2364 1.5455 SE +/- 0.00084, N = 2 SE +/- 0.00287, N = 2 1.37399 1.34467
Memcached Set To Get Ratio: 1:100 OpenBenchmarking.org Ops/sec, More Is Better Memcached 1.6.19 Set To Get Ratio: 1:100 a b 110K 220K 330K 440K 550K SE +/- 10780.90, N = 2 SE +/- 14781.65, N = 2 496309.76 485922.63 1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
NCNN Target: Vulkan GPU - Model: blazeface OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: blazeface a b 0.216 0.432 0.648 0.864 1.08 SE +/- 0.00, N = 2 SE +/- 0.01, N = 2 0.94 0.96 MIN: 0.86 / MAX: 3.07 MIN: 0.8 / MAX: 3.11 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
SVT-AV1 Encoder Mode: Preset 12 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.6 Encoder Mode: Preset 12 - Input: Bosphorus 1080p a b 30 60 90 120 150 SE +/- 0.24, N = 2 SE +/- 0.70, N = 2 154.60 157.69 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Embree Binary: Pathtracer - Model: Crown OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.1 Binary: Pathtracer - Model: Crown a b 0.7924 1.5848 2.3772 3.1696 3.962 SE +/- 0.0068, N = 2 SE +/- 0.0668, N = 2 3.5218 3.4542 MIN: 2.81 / MAX: 4.3 MIN: 2.82 / MAX: 4.4
SVT-AV1 Encoder Mode: Preset 13 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.6 Encoder Mode: Preset 13 - Input: Bosphorus 1080p a b 50 100 150 200 250 SE +/- 0.66, N = 2 SE +/- 0.84, N = 2 202.85 206.77 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
NCNN Target: Vulkan GPU - Model: vision_transformer OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: vision_transformer a b 50 100 150 200 250 SE +/- 3.20, N = 2 SE +/- 3.18, N = 2 240.03 235.51 MIN: 196.33 / MAX: 300.62 MIN: 187.15 / MAX: 291.28 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Neural Magic DeepSparse Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream a b 16 32 48 64 80 SE +/- 5.54, N = 2 SE +/- 6.64, N = 2 69.56 70.82
Stress-NG Test: Vector Math OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Vector Math a b 3K 6K 9K 12K 15K SE +/- 425.64, N = 2 SE +/- 913.67, N = 2 12666.19 12444.33 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Neural Magic DeepSparse Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream a b 7 14 21 28 35 SE +/- 2.30, N = 2 SE +/- 2.67, N = 2 28.92 28.46
NCNN Target: Vulkan GPU - Model: vgg16 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: vgg16 a b 20 40 60 80 100 SE +/- 0.23, N = 2 SE +/- 0.11, N = 2 97.70 96.26 MIN: 94.39 / MAX: 117.31 MIN: 93.43 / MAX: 114.68 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OSPRay Benchmark: gravity_spheres_volume/dim_512/ao/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 2.12 Benchmark: gravity_spheres_volume/dim_512/ao/real_time a b 0.1677 0.3354 0.5031 0.6708 0.8385 SE +/- 0.009807, N = 2 SE +/- 0.008961, N = 2 0.745146 0.734783
Apache IoTDB Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 200 OpenBenchmarking.org Average Latency, Fewer Is Better Apache IoTDB 1.1.2 Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 200 a b 5 10 15 20 25 22.62 22.93 MAX: 1174.88 MAX: 1216.89
NCNN Target: CPU - Model: vgg16 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: vgg16 a b 20 40 60 80 100 SE +/- 0.26, N = 2 SE +/- 2.34, N = 2 97.42 98.75 MIN: 94.57 / MAX: 115.91 MIN: 90.66 / MAX: 1242.52 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Apache IoTDB Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 500 OpenBenchmarking.org point/sec, More Is Better Apache IoTDB 1.1.2 Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 500 a b 200K 400K 600K 800K 1000K 996242.81 1009840.28
Memcached Set To Get Ratio: 1:5 OpenBenchmarking.org Ops/sec, More Is Better Memcached 1.6.19 Set To Get Ratio: 1:5 a b 110K 220K 330K 440K 550K SE +/- 12286.93, N = 2 SE +/- 13966.62, N = 2 525368.39 518308.20 1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
NCNN Target: Vulkan GPU - Model: resnet18 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: resnet18 a b 3 6 9 12 15 SE +/- 0.04, N = 2 SE +/- 0.04, N = 2 12.73 12.56 MIN: 11.94 / MAX: 28.78 MIN: 11.83 / MAX: 29.03 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: efficientnet-b0 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: efficientnet-b0 a b 3 6 9 12 15 SE +/- 0.07, N = 2 SE +/- 0.02, N = 2 9.22 9.34 MIN: 8.52 / MAX: 24.96 MIN: 8.61 / MAX: 25.62 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Timed LLVM Compilation Build System: Unix Makefiles OpenBenchmarking.org Seconds, Fewer Is Better Timed LLVM Compilation 16.0 Build System: Unix Makefiles a b 600 1200 1800 2400 3000 SE +/- 45.61, N = 2 SE +/- 7.61, N = 2 2967.53 2929.91
Z3 Theorem Prover SMT File: 1.smt2 OpenBenchmarking.org Seconds, Fewer Is Better Z3 Theorem Prover 4.12.1 SMT File: 1.smt2 a b 10 20 30 40 50 SE +/- 0.05, N = 2 SE +/- 0.18, N = 2 43.97 43.42 1. (CXX) g++ options: -lpthread -std=c++17 -fvisibility=hidden -mfpmath=sse -msse -msse2 -O3 -fPIC
NCNN Target: Vulkan GPU - Model: resnet50 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: resnet50 a b 8 16 24 32 40 SE +/- 0.55, N = 2 SE +/- 1.36, N = 2 33.18 33.60 MIN: 31.12 / MAX: 58.5 MIN: 30.7 / MAX: 59.9 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Apache IoTDB Device Count: 200 - Batch Size Per Write: 100 - Sensor Count: 200 OpenBenchmarking.org point/sec, More Is Better Apache IoTDB 1.1.2 Device Count: 200 - Batch Size Per Write: 100 - Sensor Count: 200 a b 3M 6M 9M 12M 15M 12613348.93 12464073.00
Stress-NG Test: Matrix Math OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Matrix Math a b 4K 8K 12K 16K 20K SE +/- 1025.15, N = 2 SE +/- 854.63, N = 2 17318.70 17116.79 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Embree Binary: Pathtracer ISPC - Model: Crown OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.1 Binary: Pathtracer ISPC - Model: Crown a b 0.8408 1.6816 2.5224 3.3632 4.204 SE +/- 0.0660, N = 2 SE +/- 0.1040, N = 2 3.7369 3.6962 MIN: 3.08 / MAX: 4.7 MIN: 3.06 / MAX: 4.78
Neural Magic DeepSparse Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream a b 200 400 600 800 1000 SE +/- 75.66, N = 2 SE +/- 86.04, N = 2 855.68 864.99
QuantLib OpenBenchmarking.org MFLOPS, More Is Better QuantLib 1.30 a b 500 1000 1500 2000 2500 SE +/- 210.20, N = 2 SE +/- 234.25, N = 2 2379.8 2404.8 1. (CXX) g++ options: -O3 -march=native -fPIE -pie
Liquid-DSP Threads: 2 - Buffer Length: 256 - Filter Length: 512 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 2 - Buffer Length: 256 - Filter Length: 512 a b 3M 6M 9M 12M 15M SE +/- 158500.00, N = 2 SE +/- 149500.00, N = 2 15179500 15023500 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Apache IoTDB Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 500 OpenBenchmarking.org Average Latency, Fewer Is Better Apache IoTDB 1.1.2 Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 500 a b 8 16 24 32 40 34.81 34.48 MAX: 1426.89 MAX: 1495.07
dav1d Video Input: Summer Nature 4K OpenBenchmarking.org FPS, More Is Better dav1d 1.2.1 Video Input: Summer Nature 4K a b 15 30 45 60 75 SE +/- 0.31, N = 2 SE +/- 2.35, N = 2 67.01 66.38 1. (CC) gcc options: -pthread -lm
Apache IoTDB Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 200 OpenBenchmarking.org point/sec, More Is Better Apache IoTDB 1.1.2 Device Count: 100 - Batch Size Per Write: 1 - Sensor Count: 200 a b 110K 220K 330K 440K 550K 529599.00 524660.18
Stress-NG Test: Memory Copying OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Memory Copying a b 200 400 600 800 1000 SE +/- 40.86, N = 2 SE +/- 40.05, N = 2 1078.73 1088.75 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Apache IoTDB Device Count: 200 - Batch Size Per Write: 100 - Sensor Count: 200 OpenBenchmarking.org Average Latency, Fewer Is Better Apache IoTDB 1.1.2 Device Count: 200 - Batch Size Per Write: 100 - Sensor Count: 200 a b 30 60 90 120 150 144.53 143.24 MAX: 1992.27 MAX: 2073.51
NCNN Target: CPU - Model: regnety_400m OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: regnety_400m a b 3 6 9 12 15 SE +/- 0.14, N = 2 SE +/- 0.04, N = 2 10.03 10.12 MIN: 8.93 / MAX: 25.51 MIN: 9.48 / MAX: 25.78 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Timed Godot Game Engine Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed Godot Game Engine Compilation 4.0 Time To Compile a b 300 600 900 1200 1500 SE +/- 0.29, N = 2 SE +/- 12.96, N = 2 1372.60 1384.77
Neural Magic DeepSparse Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream a b 3 6 9 12 15 SE +/- 1.04, N = 2 SE +/- 1.15, N = 2 10.25 10.34
VVenC Video Input: Bosphorus 4K - Video Preset: Fast OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.9 Video Input: Bosphorus 4K - Video Preset: Fast a b 0.2655 0.531 0.7965 1.062 1.3275 SE +/- 0.013, N = 2 SE +/- 0.013, N = 2 1.180 1.170 1. (CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto
Neural Magic DeepSparse Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream a b 30 60 90 120 150 SE +/- 15.16, N = 2 SE +/- 17.32, N = 2 137.95 139.05
VVenC Video Input: Bosphorus 4K - Video Preset: Faster OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.9 Video Input: Bosphorus 4K - Video Preset: Faster a b 0.6059 1.2118 1.8177 2.4236 3.0295 SE +/- 0.052, N = 2 SE +/- 0.045, N = 2 2.693 2.672 1. (CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto
NCNN Target: CPU - Model: vision_transformer OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: vision_transformer a b 50 100 150 200 250 SE +/- 0.17, N = 2 SE +/- 5.04, N = 2 233.96 232.17 MIN: 196.52 / MAX: 292.9 MIN: 187.55 / MAX: 291.61 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Neural Magic DeepSparse Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream a b 0.5294 1.0588 1.5882 2.1176 2.647 SE +/- 0.2053, N = 2 SE +/- 0.2323, N = 2 2.3527 2.3352
Stress-NG Test: Wide Vector Math OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Wide Vector Math a b 30K 60K 90K 120K 150K SE +/- 2993.76, N = 2 SE +/- 3329.63, N = 2 147314.85 148388.79 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Embree Binary: Pathtracer - Model: Asian Dragon Obj OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.1 Binary: Pathtracer - Model: Asian Dragon Obj a b 0.8663 1.7326 2.5989 3.4652 4.3315 SE +/- 0.0521, N = 2 SE +/- 0.0122, N = 2 3.8229 3.8503 MIN: 3.19 / MAX: 4.76 MIN: 3.22 / MAX: 4.83
Stress-NG Test: Semaphores OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Semaphores a b 700K 1400K 2100K 2800K 3500K SE +/- 287315.12, N = 2 SE +/- 70160.30, N = 2 3218725.51 3241408.21 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Neural Magic DeepSparse Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream a b 40 80 120 160 200 SE +/- 20.06, N = 2 SE +/- 21.75, N = 2 196.78 195.49
Liquid-DSP Threads: 1 - Buffer Length: 256 - Filter Length: 32 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 1 - Buffer Length: 256 - Filter Length: 32 a b 10M 20M 30M 40M 50M SE +/- 451500.00, N = 2 SE +/- 77000.00, N = 2 46516500 46825000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
libxsmm M N K: 32 OpenBenchmarking.org GFLOPS/s, More Is Better libxsmm 2-1.17-3645 M N K: 32 a b 11 22 33 44 55 SE +/- 0.35, N = 2 SE +/- 0.55, N = 2 47.7 48.0 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -march=core-avx2
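For context on the libxsmm figures, the "M N K" value is the GEMM size being benchmarked, and the conventional flop count for a dense M x N x K matrix multiply is 2*M*N*K. A small sketch, assuming that convention, of what the 47.7 GFLOPS/s result for configuration a implies in terms of 32x32x32 multiplies per second:

```python
# Sketch, assuming the conventional 2*M*N*K flop count for a dense GEMM
# (one multiply and one add per inner-product term). Uses the "a" result above.
def gemm_flops(m: int, n: int, k: int) -> int:
    """Floating-point operations in one dense M x N x K matrix multiplication."""
    return 2 * m * n * k

flops_per_gemm = gemm_flops(32, 32, 32)           # 65,536 flops
measured_gflops = 47.7                            # libxsmm M N K: 32, result "a"
gemms_per_second = measured_gflops * 1e9 / flops_per_gemm
print(f"{flops_per_gemm} flops per 32^3 GEMM")
print(f"~{gemms_per_second / 1e6:.2f} million GEMMs/s at {measured_gflops} GFLOPS/s")
```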
NCNN 20230517 - Target: Vulkan GPU - Model: googlenet - ms, Fewer Is Better: a = 16.29 (SE +/- 0.00, N = 2; MIN: 14.94 / MAX: 33.61), b = 16.19 (SE +/- 0.01, N = 2; MIN: 14.91 / MAX: 32.9). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Timed LLVM Compilation 16.0 - Build System: Ninja - Seconds, Fewer Is Better: a = 2892.09 (SE +/- 4.75, N = 2), b = 2908.88 (SE +/- 4.95, N = 2)
NCNN 20230517 - Target: Vulkan GPU - Model: mnasnet - ms, Fewer Is Better: a = 5.23 (SE +/- 0.81, N = 2; MIN: 4.05 / MAX: 26.68), b = 5.26 (SE +/- 0.78, N = 2; MIN: 4.06 / MAX: 22.06). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
SVT-AV1 1.6 - Encoder Mode: Preset 8 - Input: Bosphorus 4K - Frames Per Second, More Is Better: a = 9.490 (SE +/- 0.307, N = 2), b = 9.436 (SE +/- 1.160, N = 2). 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
NCNN 20230517 - Target: Vulkan GPU - Model: yolov4-tiny - ms, Fewer Is Better: a = 39.39 (SE +/- 1.02, N = 2; MIN: 36.4 / MAX: 60.27), b = 39.17 (SE +/- 1.11, N = 2; MIN: 36.39 / MAX: 59.83). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Xonotic 0.8.6 - Resolution: 1920 x 1080 - Effects Quality: Ultra - Frames Per Second, More Is Better: a = 77.93 (SE +/- 0.16, N = 2; MIN: 34 / MAX: 120), b = 77.52 (SE +/- 0.41, N = 2; MIN: 32 / MAX: 120)
Stress-NG 0.15.10 - Test: Socket Activity - Bogo Ops/s, More Is Better: a = 2495.57 (SE +/- 249.85, N = 2), b = 2508.63 (SE +/- 271.08, N = 2). 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Embree 4.1 - Binary: Pathtracer ISPC - Model: Asian Dragon - Frames Per Second, More Is Better: a = 4.7883 (SE +/- 0.0697, N = 2; MIN: 4.05 / MAX: 5.89), b = 4.7636 (SE +/- 0.0364, N = 2; MIN: 4.05 / MAX: 5.98)
Apache IoTDB 1.1.2 - Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 200 - point/sec, More Is Better: a = 816344.20, b = 812170.86
Neural Magic DeepSparse 1.5 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream - items/sec, More Is Better: a = 14.67 (SE +/- 1.61, N = 2), b = 14.61 (SE +/- 1.82, N = 2)
Neural Magic DeepSparse 1.5 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream - ms/batch, Fewer Is Better: a = 244.56 (SE +/- 31.56, N = 2), b = 245.64 (SE +/- 32.45, N = 2)
Embree 4.1 - Binary: Pathtracer - Model: Asian Dragon - Frames Per Second, More Is Better: a = 4.2183 (SE +/- 0.0414, N = 2; MIN: 3.54 / MAX: 5.31), b = 4.1998 (SE +/- 0.0239, N = 2; MIN: 3.54 / MAX: 5.43)
Apache IoTDB 1.1.2 - Device Count: 500 - Batch Size Per Write: 1 - Sensor Count: 200 - point/sec, More Is Better: a = 1036429.93, b = 1040683.13
Xonotic 0.8.6 - Resolution: 1920 x 1080 - Effects Quality: Ultimate - Frames Per Second, More Is Better: a = 59.17 (SE +/- 0.07, N = 2; MIN: 24 / MAX: 94), b = 58.95 (SE +/- 0.06, N = 2; MIN: 24 / MAX: 94)
Z3 Theorem Prover 4.12.1 - SMT File: 2.smt2 - Seconds, Fewer Is Better: a = 130.94 (SE +/- 0.11, N = 2), b = 131.41 (SE +/- 0.05, N = 2). 1. (CXX) g++ options: -lpthread -std=c++17 -fvisibility=hidden -mfpmath=sse -msse -msse2 -O3 -fPIC
NCNN 20230517 - Target: CPU - Model: alexnet - ms, Fewer Is Better: a = 11.68 (SE +/- 0.06, N = 2; MIN: 10.92 / MAX: 27.91), b = 11.64 (SE +/- 0.07, N = 2; MIN: 10.99 / MAX: 27.75). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Apache IoTDB 1.1.2 - Device Count: 200 - Batch Size Per Write: 1 - Sensor Count: 500 - Average Latency, More Is Better: a = 29.65 (MAX: 1193.78), b = 29.75 (MAX: 1271.26)
NCNN 20230517 - Target: Vulkan GPU - Model: shufflenet-v2 - ms, Fewer Is Better: a = 3.66 (SE +/- 0.63, N = 2; MIN: 2.74 / MAX: 16.28), b = 3.67 (SE +/- 0.62, N = 2; MIN: 2.75 / MAX: 19.08). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: googlenet - ms, Fewer Is Better: a = 16.23 (SE +/- 0.08, N = 2; MIN: 15.01 / MAX: 32.85), b = 16.19 (SE +/- 0.10, N = 2; MIN: 14.97 / MAX: 32.25). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Liquid-DSP 1.6 - Threads: 2 - Buffer Length: 256 - Filter Length: 57 - samples/s, More Is Better: a = 73795500 (SE +/- 128500.00, N = 2), b = 73620500 (SE +/- 620500.00, N = 2). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 1.6 - Threads: 4 - Buffer Length: 256 - Filter Length: 32 - samples/s, More Is Better: a = 140280000 (SE +/- 2650000.00, N = 2), b = 139950000 (SE +/- 2610000.00, N = 2). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Xonotic 0.8.6 - Resolution: 1920 x 1080 - Effects Quality: High - Frames Per Second, More Is Better: a = 91.11 (SE +/- 0.04, N = 2; MIN: 39 / MAX: 130), b = 90.91 (SE +/- 0.32, N = 2; MIN: 38 / MAX: 130)
dav1d 1.2.1 - Video Input: Chimera 1080p 10-bit - FPS, More Is Better: a = 191.30 (SE +/- 19.41, N = 2), b = 190.87 (SE +/- 12.81, N = 2). 1. (CC) gcc options: -pthread -lm
Neural Magic DeepSparse 1.5 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream - items/sec, More Is Better: a = 8.3009 (SE +/- 1.0816, N = 2), b = 8.2832 (SE +/- 1.0939, N = 2)
NCNN 20230517 - Target: Vulkan GPU - Model: alexnet - ms, Fewer Is Better: a = 11.67 (SE +/- 0.03, N = 2; MIN: 11.06 / MAX: 27.75), b = 11.69 (SE +/- 0.04, N = 2; MIN: 10.95 / MAX: 28.31). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Liquid-DSP 1.6 - Threads: 1 - Buffer Length: 256 - Filter Length: 512 - samples/s, More Is Better: a = 7721800 (SE +/- 8100.00, N = 2), b = 7710850 (SE +/- 20850.00, N = 2). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
VVenC 1.9 - Video Input: Bosphorus 1080p - Video Preset: Faster - Frames Per Second, More Is Better: a = 9.882 (SE +/- 0.090, N = 2), b = 9.868 (SE +/- 0.603, N = 2). 1. (CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto
Liquid-DSP 1.6 - Threads: 2 - Buffer Length: 256 - Filter Length: 32 - samples/s, More Is Better: a = 85550500 (SE +/- 873500.00, N = 2), b = 85664000 (SE +/- 1161000.00, N = 2). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
libxsmm 2-1.17-3645 - M N K: 64 - GFLOPS/s, More Is Better: a = 90.9 (SE +/- 4.45, N = 2), b = 90.8 (SE +/- 4.80, N = 2). 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -march=core-avx2
Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream - items/sec, More Is Better: a = 65.43 (SE +/- 0.37, N = 2), b = 65.36 (SE +/- 0.29, N = 2)
Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream - ms/batch, Fewer Is Better: a = 30.54 (SE +/- 0.17, N = 2), b = 30.57 (SE +/- 0.13, N = 2)
Opus Codec Encoding 1.4 - WAV To Opus Encode - Seconds, Fewer Is Better: a = 35.40 (SE +/- 0.10, N = 2), b = 35.43 (SE +/- 0.06, N = 2). 1. (CXX) g++ options: -O3 -fvisibility=hidden -logg -lm
NCNN 20230517 - Target: CPU - Model: resnet18 - ms, Fewer Is Better: a = 12.61 (SE +/- 0.04, N = 2; MIN: 11.89 / MAX: 29.06), b = 12.62 (SE +/- 0.11, N = 2; MIN: 11.83 / MAX: 28.8). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Xonotic 0.8.6 - Resolution: 1920 x 1080 - Effects Quality: Low - Frames Per Second, More Is Better: a = 205.61 (SE +/- 0.72, N = 2; MIN: 99 / MAX: 336), b = 205.45 (SE +/- 0.60, N = 2; MIN: 95 / MAX: 334)
Liquid-DSP 1.6 - Threads: 4 - Buffer Length: 256 - Filter Length: 512 - samples/s, More Is Better: a = 25830500 (SE +/- 466500.00, N = 2), b = 25812500 (SE +/- 478500.00, N = 2). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Apache Cassandra 4.1.3 - Test: Writes - Op/s, More Is Better: a = 26422 (SE +/- 149.00, N = 2), b = 26438 (SE +/- 71.50, N = 2)
Stress-NG 0.15.10 - Test: Vector Shuffle - Bogo Ops/s, More Is Better: a = 2431.22 (SE +/- 33.39, N = 2), b = 2431.01 (SE +/- 17.07, N = 2). 1. (CXX) g++ options: -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -lEGL -lgbm -lGLESv2 -ljpeg -lpthread -lrt -lsctp -lz
Liquid-DSP 1.6 - Threads: 1 - Buffer Length: 256 - Filter Length: 57 - samples/s, More Is Better: a = 43429000 (SE +/- 225000.00, N = 2), b = 43427500 (SE +/- 154500.00, N = 2). 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
NCNN 20230517 - Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 - ms, Fewer Is Better: a = 5.11 (SE +/- 0.77, N = 2; MIN: 4.07 / MAX: 20.14), b = 5.11 (SE +/- 0.80, N = 2; MIN: 4.05 / MAX: 26.7). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 - ms, Fewer Is Better: a = 6.67 (SE +/- 0.67, N = 2; MIN: 5.54 / MAX: 27.83), b = 6.67 (SE +/- 0.65, N = 2; MIN: 5.58 / MAX: 27.94). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Intel Open Image Denoise 2.0 - Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only - Images / Sec, More Is Better: a = 0.06 (SE +/- 0.00, N = 2), b = 0.06 (SE +/- 0.00, N = 2)
Intel Open Image Denoise 2.0 - Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only - Images / Sec, More Is Better: a = 0.12 (SE +/- 0.00, N = 2), b = 0.12 (SE +/- 0.00, N = 2)
Intel Open Image Denoise 2.0 - Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only - Images / Sec, More Is Better: a = 0.12 (SE +/- 0.00, N = 2), b = 0.12 (SE +/- 0.00, N = 2)
vkpeak 20230730 - fp32-scalar - GFLOPS, More Is Better: a = 268.33 (SE +/- 0.00, N = 2), b = 268.33 (SE +/- 0.01, N = 2)
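Across these tests the a and b configurations differ mostly at the margin of run-to-run noise (N = 2 throughout), so a summary is easier to read as a single geometric mean of per-test b/a ratios, with fewer-is-better metrics inverted so that values above 1.0 always favor b. A minimal sketch of that aggregation, not necessarily the method the eventual article will use, with only a handful of the results above hard-coded for illustration:

```python
# Sketch: geometric mean of per-test ratios (b relative to a), inverting
# fewer-is-better metrics so that >1.0 always means "b is ahead".
# Only a small subset of the results above is hard-coded here for illustration.
from math import prod

# (test, a, b, lower_is_better)
results = [
    ("Z3 4.12.1 - 1.smt2 (s)",             43.97,    43.42,    True),
    ("Timed LLVM Compilation 16.0 (s)",    2892.09,  2908.88,  True),
    ("Stress-NG Matrix Math (Bogo Ops/s)", 17318.70, 17116.79, False),
    ("dav1d Summer Nature 4K (FPS)",       67.01,    66.38,    False),
    ("libxsmm M N K: 32 (GFLOPS/s)",       47.7,     48.0,     False),
]

ratios = [(a / b) if lower else (b / a) for _, a, b, lower in results]
geomean = prod(ratios) ** (1 / len(ratios))
print(f"Geometric mean (b vs a) over {len(ratios)} tests: {geomean:.4f}")
# A value this close to 1.0 sits well within the reported standard errors,
# so neither configuration shows a meaningful lead on this subset.
```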
Phoronix Test Suite v10.8.5