ddddx AMD Ryzen Threadripper PRO 5965WX 24-Cores testing with a ASUS Pro WS WRX80E-SAGE SE WIFI (1201 BIOS) and ASUS NVIDIA NV106 2GB on Ubuntu 23.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2403218-NE-DDDDX513530&gru&sor.
ddddx System Details (identical for runs a, b, c, d)

Processor: AMD Ryzen Threadripper PRO 5965WX 24-Cores @ 3.80GHz (24 Cores / 48 Threads)
Motherboard: ASUS Pro WS WRX80E-SAGE SE WIFI (1201 BIOS)
Chipset: AMD Starship/Matisse
Memory: 8 x 16GB DDR4-2133MT/s Corsair CMK32GX4M2E3200C16
Disk: 2048GB SOLIDIGM SSDPFKKW020X7
Graphics: ASUS NVIDIA NV106 2GB
Audio: AMD Starship/Matisse
Monitor: VA2431
Network: 2 x Intel X550 + Intel Wi-Fi 6 AX200
OS: Ubuntu 23.10
Kernel: 6.5.0-15-generic (x86_64)
Desktop: GNOME Shell 45.0
Display Server: X Server + Wayland
Display Driver: nouveau
OpenGL: 4.3 Mesa 23.2.1-1ubuntu3
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details - Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0xa008205
Python Details - Python 3.11.6
Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET no microcode + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
ddddx Result Overview (runs a, b, c, d)

Benchmarks covered: OpenVINO 2024.0 (20 model/precision configurations on CPU), SVT-AV1 2.0 (Presets 4/8/12/13, Bosphorus 4K and 1080p), VVenC 1.11 (Fast/Faster, Bosphorus 4K and 1080p), OSPRay 3.1 (particle_volume and gravity_spheres_volume scenes), Neural Magic DeepSparse 1.7 (18 model/scenario configurations), srsRAN (PDSCH/PUSCH processor throughput, total and per-thread), JPEG-XL encode and decode, Stockfish chess benchmark, RocksDB, BRL-CAD, V-RAY, oneDNN, OSPRay Studio, Draco, timed Linux kernel build (defconfig and allmodconfig), Parallel BZIP2 compression, Primesieve (1e12 and 1e13), and WavPack WAV encoding. Per-test results with the values for each run appear in the detailed sections below.
OpenVINO 2024.0 (FPS, more is better; Device: CPU)

Face Detection FP16 (SE +/- 0.01, N = 3): b 7.68, d 7.64, a 7.60, c 7.59
Person Detection FP16 (SE +/- 0.04, N = 3): a 70.01, b 69.95, c 69.83, d 69.61
Person Detection FP32 (SE +/- 0.06, N = 3): b 70.01, c 69.99, d 69.86, a 69.73
Vehicle Detection FP16 (SE +/- 0.39, N = 3): d 603.96, a 600.94, b 598.03, c 592.96
Face Detection FP16-INT8 (SE +/- 0.01, N = 3): d 16.74, b 16.73, c 16.66, a 16.66
Face Detection Retail FP16 (SE +/- 4.26, N = 3): d 2198.11, c 2197.98, b 2197.50, a 2192.09
Road Segmentation ADAS FP16 (SE +/- 0.04, N = 3): b 170.68, a 170.10, c 169.65, d 169.41
Vehicle Detection FP16-INT8 (SE +/- 1.89, N = 3): c 1107.64, b 1106.44, a 1104.03, d 1103.60
Weld Porosity Detection FP16 (SE +/- 0.61, N = 3): b 719.93, a 718.20, d 716.84, c 715.28
Face Detection Retail FP16-INT8 (SE +/- 3.42, N = 3): b 3241.35, d 3230.25, c 3228.19, a 3224.64
Road Segmentation ADAS FP16-INT8 (SE +/- 0.36, N = 3): d 428.55, a 428.43, b 428.18, c 427.02
Machine Translation EN To DE FP16 (SE +/- 0.03, N = 3): b 88.33, d 88.14, a 88.13, c 88.05
Weld Porosity Detection FP16-INT8 (SE +/- 0.87, N = 3): b 1654.72, d 1652.91, a 1652.05, c 1647.13
Person Vehicle Bike Detection FP16 (SE +/- 3.73, N = 3): a 895.25, c 894.57, d 894.41, b 889.36
Noise Suppression Poconet-Like FP16 (SE +/- 3.03, N = 3): b 1046.30, d 1043.82, c 1041.67, a 1033.67
Handwritten English Recognition FP16 (SE +/- 0.65, N = 3): b 380.22, c 378.34, a 377.84, d 377.49
Person Re-Identification Retail FP16 (SE +/- 0.87, N = 3): d 1202.22, c 1202.09, b 1201.22, a 1198.64
Age Gender Recognition Retail 0013 FP16 (SE +/- 23.00, N = 3): d 24496.84, a 24478.31, b 24434.51, c 24329.21
Handwritten English Recognition FP16-INT8 (SE +/- 0.10, N = 3): b 426.03, c 423.46, a 421.72, d 421.54
Age Gender Recognition Retail 0013 FP16-INT8 (SE +/- 27.92, N = 3): d 44596.35, a 44518.20, b 44461.02, c 44306.10

1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
SVT-AV1 2.0 (Frames Per Second, more is better)

Preset 4 - Bosphorus 4K (SE +/- 0.005, N = 3): b 6.787, c 6.711, d 6.660, a 6.647
Preset 8 - Bosphorus 4K (SE +/- 0.26, N = 3): b 63.89, a 62.95, d 62.93, c 62.50
Preset 12 - Bosphorus 4K (SE +/- 0.99, N = 3): a 154.17, d 151.64, b 150.81, c 150.13
Preset 13 - Bosphorus 4K (SE +/- 1.34, N = 3): d 152.47, a 151.96, c 150.52, b 145.97
Preset 4 - Bosphorus 1080p (SE +/- 0.04, N = 3): d 19.32, a 19.21, c 19.17, b 18.92
Preset 8 - Bosphorus 1080p (SE +/- 0.72, N = 3): d 133.78, c 133.27, b 131.93, a 131.72
Preset 12 - Bosphorus 1080p (SE +/- 6.37, N = 3): b 499.14, a 489.43, d 488.52, c 487.66
Preset 13 - Bosphorus 1080p (SE +/- 4.87, N = 3): c 606.36, b 602.97, d 587.97, a 587.48

1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
VVenC 1.11 (Frames Per Second, more is better)

Bosphorus 4K - Fast (SE +/- 0.031, N = 3): b 7.075, c 7.066, d 7.062, a 7.060
Bosphorus 4K - Faster (SE +/- 0.03, N = 3): d 14.86, b 14.82, c 14.78, a 14.57
Bosphorus 1080p - Fast (SE +/- 0.03, N = 3): a 19.62, d 19.53, b 19.49, c 19.46
Bosphorus 1080p - Faster (SE +/- 0.15, N = 3): a 41.28, d 41.00, c 40.92, b 40.46

1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
OSPRay 3.1 (Items Per Second, more is better)

particle_volume/ao/real_time (SE +/- 0.01, N = 3): d 10.30, a 10.28, c 10.28, b 10.25
particle_volume/scivis/real_time (SE +/- 0.01, N = 3): d 10.18, c 10.15, b 10.14, a 10.14
particle_volume/pathtracer/real_time (SE +/- 0.22, N = 3): a 156.40, c 155.83, b 155.37, d 155.05
gravity_spheres_volume/dim_512/ao/real_time (SE +/- 0.01074, N = 3): a 4.90257, d 4.89928, c 4.88608, b 4.88053
gravity_spheres_volume/dim_512/scivis/real_time (SE +/- 0.00963, N = 3): a 4.59882, c 4.59266, d 4.58081, b 4.57706
gravity_spheres_volume/dim_512/pathtracer/real_time (SE +/- 0.00473, N = 3): a 7.46614, d 7.45220, b 7.44837, c 7.43821
Neural Magic DeepSparse 1.7 (items/sec, more is better)

NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream (SE +/- 0.04, N = 3): d 26.69, a 26.69, c 26.62, b 26.60
NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream (SE +/- 0.02, N = 3): b 18.51, c 18.40, d 18.40, a 18.37
NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream (SE +/- 0.35, N = 3): d 685.82, b 685.45, a 684.57, c 684.02
NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream (SE +/- 0.92, N = 3): d 194.38, a 194.29, c 192.98, b 191.76
ResNet-50, Baseline - Asynchronous Multi-Stream (SE +/- 0.00, N = 3): b 307.02, a 307.00, d 306.94, c 306.92
ResNet-50, Baseline - Synchronous Single-Stream (SE +/- 0.37, N = 3): a 156.21, d 156.04, c 155.68, b 155.34
ResNet-50, Sparse INT8 - Asynchronous Multi-Stream (SE +/- 6.40, N = 3): b 2032.96, d 2014.58, c 1998.61, a 1994.39
ResNet-50, Sparse INT8 - Synchronous Single-Stream (SE +/- 1.90, N = 3): b 800.05, c 799.42, a 798.47, d 794.39
Llama2 Chat 7b Quantized - Asynchronous Multi-Stream (SE +/- 0.0031, N = 3): a 1.8772, c 1.8705, d 1.8701, b 1.8677
Llama2 Chat 7b Quantized - Synchronous Single-Stream (SE +/- 0.0038, N = 3): a 6.0448, c 6.0425, b 6.0393, d 6.0373
CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream (SE +/- 0.03, N = 3): b 307.09, a 306.97, c 306.95, d 306.90
CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream (SE +/- 0.15, N = 3): a 156.43, d 155.90, c 155.79, b 155.31
CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream (SE +/- 0.31, N = 3): b 151.98, c 151.26, a 150.83, d 150.73
CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream (SE +/- 0.05, N = 3): a 99.33, b 99.22, d 99.05, c 98.80
NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream (SE +/- 0.20, N = 3): b 224.20, a 224.16, d 223.55, c 223.00
NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream (SE +/- 0.07, N = 3): a 112.33, c 112.00, b 111.84, d 111.45
CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream (SE +/- 0.09, N = 3): d 30.49, b 30.49, a 30.47, c 30.39
CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream (SE +/- 0.01, N = 3): b 21.75, d 21.72, a 21.70, c 21.69
Neural Magic DeepSparse Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream b c d a 70 140 210 280 350 SE +/- 0.53, N = 3 337.45 334.75 334.72 334.39
Neural Magic DeepSparse Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream b c a d 20 40 60 80 100 SE +/- 0.14, N = 3 76.12 76.06 75.88 75.70
Neural Magic DeepSparse Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream b d a c 6 12 18 24 30 SE +/- 0.03, N = 3 26.86 26.81 26.74 26.70
Neural Magic DeepSparse Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream b a c d 5 10 15 20 25 SE +/- 0.01, N = 3 18.52 18.49 18.49 18.48
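The "SE +/-" figures reported with each result are standard errors of the mean over the N runs. A minimal sketch of that calculation (the sample values below are illustrative, not taken from these results):

```python
import statistics

def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    n = len(samples)
    return statistics.stdev(samples) / n ** 0.5

# Three illustrative throughput runs (items/sec)
runs = [155.8, 156.2, 156.6]
mean = statistics.mean(runs)
se = standard_error(runs)
print(f"{mean:.2f} items/sec, SE +/- {se:.2f}, N = {len(runs)}")
```

A small SE relative to the mean, as in most of the rows above, indicates the three runs were tightly clustered.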
srsRAN Project 23.10.1-20240219 (Mbps, more is better)
  PDSCH Processor Benchmark, Throughput Total: b: 12063.4, c: 11723.5, a: 11710.3, d: 11654.9 (SE +/- 27.21, N = 3)
  PUSCH Processor Benchmark, Throughput Total: b: 1889.2 (min 1137.6), a: 1888.2 (min 1136.3), c: 1886.9 (min 1135, max 1889.2), d: 1857.0 (min 1145.4) (SE +/- 1.62, N = 3)
  PDSCH Processor Benchmark, Throughput Thread: d: 607.8, c: 605.5, a: 603.7, b: 600.2 (SE +/- 1.71, N = 3)
  PUSCH Processor Benchmark, Throughput Thread: b: 178.7 (min 112.2), c: 178.6 (min 112.6, max 179.4), d: 177.9 (min 110.9), a: 177.6 (min 113.3) (SE +/- 0.70, N = 3)
  (CXX) g++ options: -march=native -mavx2 -mavx -msse4.1 -mfma -O3 -fno-trapping-math -fno-math-errno -ldl
JPEG-XL libjxl 0.10.1 (MP/s, more is better)
  Input: PNG - Quality: 80: a: 44.75, b: 43.25, d: 42.18, c: 39.90 (SE +/- 0.29, N = 3)
  Input: PNG - Quality: 90: a: 39.40, c: 38.51, d: 38.27, b: 37.44 (SE +/- 0.27, N = 15)
  Input: JPEG - Quality: 80: a: 46.58, d: 43.50, b: 42.66, c: 42.35 (SE +/- 0.32, N = 3)
  Input: JPEG - Quality: 90: a: 42.50, c: 40.76, b: 39.89, d: 39.60 (SE +/- 0.39, N = 15)
  Input: PNG - Quality: 100: a: 27.65, b: 27.57, c: 27.49, d: 27.34 (SE +/- 0.01, N = 3)
  Input: JPEG - Quality: 100: a: 27.53, b: 27.32, c: 27.32, d: 27.32 (SE +/- 0.03, N = 3)
  (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm

JPEG-XL Decoding libjxl 0.10.1 (MP/s, more is better)
  CPU Threads: 1: a: 64.03, b: 63.30, c: 63.04, d: 62.82 (SE +/- 0.12, N = 3)
  CPU Threads: All: a: 483.42, b: 482.58, c: 470.16, d: 439.61 (SE +/- 1.28, N = 3)
Stockfish 16.1 Chess Benchmark (Nodes Per Second, more is better)
  b: 61008270, c: 55237573, d: 53082452, a: 52607528 (SE +/- 1129127.95, N = 15)
  (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -funroll-loops -msse -msse3 -mpopcnt -mavx2 -mbmi -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto-partition=one -flto=jobserver
RocksDB 9.0 (Op/s, more is better)
  Overwrite: a: 781166, b: 779236, c: 777804, d: 762175 (SE +/- 4178.44, N = 3)
  Random Fill: a: 790603, b: 783967, d: 777115, c: 774384 (SE +/- 1329.24, N = 3)
  Random Read: a: 145962891, b: 145893753, d: 145516874, c: 144921763 (SE +/- 374266.35, N = 3)
  Update Random: a: 671121, b: 666715, c: 660978, d: 658520 (SE +/- 1458.06, N = 3)
  Sequential Fill: a: 920430, d: 919148, c: 908882, b: 908177 (SE +/- 4090.17, N = 3)
  Random Fill Sync: c: 46701, d: 46299, b: 45974, a: 45528 (SE +/- 32.60, N = 3)
  Read While Writing: d: 5633613, c: 5592760, b: 5586539, a: 5546801 (SE +/- 38492.46, N = 3)
  Read Random Write Random: a: 2879787, c: 2868681, b: 2859339, d: 2814513 (SE +/- 5471.11, N = 3)
  (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
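Across the RocksDB tests the four configurations land within a few percent of one another. A small helper (hypothetical, not part of the Phoronix Test Suite) for quantifying that best-to-worst spread from one result row:

```python
def spread_pct(results):
    """Percent gap between the best and worst result, relative to the worst."""
    best, worst = max(results.values()), min(results.values())
    return (best - worst) / worst * 100

# Random Fill Sync Op/s values from the RocksDB results above
random_fill_sync = {"a": 45528, "b": 45974, "c": 46701, "d": 46299}
spread = spread_pct(random_fill_sync)
print(f"best-to-worst spread: {spread:.2f}%")
```

For higher-is-better metrics like Op/s this reads directly as the advantage of the fastest configuration over the slowest.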
BRL-CAD 7.38.2 (VGR Performance Metric, more is better)
  a: 430387, d: 425778, b: 424641, c: 420528
  (CXX) g++ options: -std=c++17 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lnetpbm -lregex_brl -lz_brl -lassimp -ldl -lm -ltk8.6
Chaos Group V-RAY 6.0 - Mode: CPU (vsamples, more is better)
  b: 44634, d: 44633, c: 44375, a: 44287 (SE +/- 195.74, N = 3)
oneDNN 3.4 - Engine: CPU (ms, fewer is better)
  IP Shapes 1D: b: 1.31101 (min 1.27), d: 1.31333 (min 1.27), a: 1.31474 (min 1.27), c: 1.32743 (min 1.26) (SE +/- 0.00673, N = 3)
  IP Shapes 3D: a: 3.52569 (min 3.48), b: 3.52584 (min 3.48), c: 3.53323 (min 3.47), d: 3.53851 (min 3.49) (SE +/- 0.00386, N = 3)
  Convolution Batch Shapes Auto: b: 2.71135 (min 2.65), c: 2.73192 (min 2.66), d: 2.73352 (min 2.67), a: 2.73804 (min 2.66) (SE +/- 0.00421, N = 3)
  Deconvolution Batch shapes_1d: d: 5.03641 (min 3.91), b: 5.33929 (min 3.86), c: 5.37367 (min 3.84), a: 5.51082 (min 3.9) (SE +/- 0.03672, N = 3)
  Deconvolution Batch shapes_3d: d: 2.34346 (min 2.25), b: 2.34602 (min 2.28), c: 2.36138 (min 2.25), a: 2.36511 (min 2.28) (SE +/- 0.00819, N = 3)
  Recurrent Neural Network Training: d: 1254.43 (min 1249.27), a: 1254.68 (min 1250.31), b: 1255.67 (min 1250.99), c: 1256.43 (min 1249.44) (SE +/- 0.77, N = 3)
  Recurrent Neural Network Inference: b: 636.77 (min 632.58), c: 637.17 (min 632.47), a: 638.13 (min 634.67), d: 642.19 (min 633.4) (SE +/- 0.39, N = 3)
  (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
OSPRay Studio 1.0 - Renderer: Path Tracer, Acceleration: CPU (ms, fewer is better)
  Camera: 1 - 4K - Samples Per Pixel: 1: b: 4521, a: 4528, d: 4528, c: 4535 (SE +/- 4.41, N = 3)
  Camera: 2 - 4K - Samples Per Pixel: 1: b: 4609, c: 4615, a: 4616, d: 4618 (SE +/- 7.22, N = 3)
  Camera: 3 - 4K - Samples Per Pixel: 1: a: 5330, d: 5331, c: 5341, b: 5342 (SE +/- 4.04, N = 3)
  Camera: 1 - 4K - Samples Per Pixel: 16: b: 76995, d: 77073, c: 77511, a: 78061 (SE +/- 155.86, N = 3)
  Camera: 1 - 4K - Samples Per Pixel: 32: b: 150040, a: 150375, c: 150381, d: 150694 (SE +/- 64.70, N = 3)
  Camera: 2 - 4K - Samples Per Pixel: 16: b: 78304, d: 78354, a: 78788, c: 78919 (SE +/- 110.73, N = 3)
  Camera: 2 - 4K - Samples Per Pixel: 32: d: 151620, c: 152510, a: 152653, b: 153552 (SE +/- 88.33, N = 3)
  Camera: 3 - 4K - Samples Per Pixel: 16: c: 90083, b: 90131, a: 90362, d: 90380 (SE +/- 228.07, N = 3)
  Camera: 3 - 4K - Samples Per Pixel: 32: d: 175129, b: 175496, c: 175767, a: 177445 (SE +/- 268.88, N = 3)
  Camera: 1 - 1080p - Samples Per Pixel: 1: d: 1134, a: 1138, b: 1138, c: 1138 (SE +/- 1.00, N = 3)
  Camera: 2 - 1080p - Samples Per Pixel: 1: b: 1150, c: 1152, d: 1153, a: 1158 (SE +/- 0.67, N = 3)
  Camera: 3 - 1080p - Samples Per Pixel: 1: d: 1330, c: 1333, a: 1336, b: 1338 (SE +/- 4.41, N = 3)
  Camera: 1 - 1080p - Samples Per Pixel: 16: d: 18177, b: 18181, a: 18203, c: 18229 (SE +/- 39.50, N = 3)
  Camera: 1 - 1080p - Samples Per Pixel: 32: d: 41111, b: 41125, a: 41167, c: 41189 (SE +/- 58.89, N = 3)
  Camera: 2 - 1080p - Samples Per Pixel: 16: b: 18471, d: 18488, a: 18543, c: 18579 (SE +/- 16.33, N = 3)
  Camera: 2 - 1080p - Samples Per Pixel: 32: b: 41598, a: 41839, d: 41959, c: 42050 (SE +/- 250.02, N = 3)
  Camera: 3 - 1080p - Samples Per Pixel: 16: d: 21385, b: 21398, c: 21409, a: 21449 (SE +/- 35.14, N = 3)
  Camera: 3 - 1080p - Samples Per Pixel: 32: a: 47495, d: 47597, c: 47686, b: 47827 (SE +/- 129.36, N = 3)
Google Draco 1.5.6 (ms, fewer is better)
  Lion: a: 5328, d: 5347, c: 5363, b: 5365 (SE +/- 15.72, N = 3)
  Church Facade: a: 7023, b: 7034, c: 7092, d: 7096 (SE +/- 9.91, N = 3)
  (CXX) g++ options: -O3
OpenVINO 2024.0 - Device: CPU (ms, fewer is better)
  Face Detection FP16: b: 1547.31 (min 1403.59, max 1636.72), d: 1553.54 (min 1365.71, max 1635.16), a: 1558.81 (min 1416.22, max 1644.19), c: 1563.15 (min 1369.79, max 1663.37) (SE +/- 0.80, N = 3)
  Person Detection FP16: a: 171.22 (min 130.32, max 233.99), b: 171.43 (min 140.26, max 224.54), c: 171.67 (min 132.19, max 231.16), d: 172.18 (min 138.53, max 224.8) (SE +/- 0.11, N = 3)
  Person Detection FP32: b: 171.16 (min 129.54, max 225.82), c: 171.29 (min 134.25, max 225.4), d: 171.54 (min 135.7, max 226.04), a: 171.92 (min 129.51, max 227.57) (SE +/- 0.12, N = 3)
  Vehicle Detection FP16: d: 19.84 (min 11.38, max 34.24), a: 19.94 (min 8.87, max 42.42), b: 20.04 (min 12.81, max 37.51), c: 20.21 (min 9.28, max 49.09) (SE +/- 0.01, N = 3)
  Face Detection FP16-INT8: d: 713.64 (min 667.56, max 731.59), b: 713.72 (min 658.61, max 738.08), a: 715.22 (min 664.62, max 729.04), c: 716.16 (min 661.6, max 732.11) (SE +/- 0.26, N = 3)
  Face Detection Retail FP16: b: 5.45 (min 2.89, max 21.85), c: 5.45 (min 2.8, max 22.36), d: 5.45 (min 2.8, max 28.34), a: 5.46 (min 2.82, max 21.64) (SE +/- 0.01, N = 3)
  Road Segmentation ADAS FP16: b: 70.23 (min 43.23, max 122.75), a: 70.48 (min 43.7, max 128.66), c: 70.66 (min 24.84, max 127.05), d: 70.76 (min 41.53, max 122.26) (SE +/- 0.02, N = 3)
  Vehicle Detection FP16-INT8: c: 10.82 (min 5.79, max 27.55), b: 10.83 (min 6.67, max 24.71), a: 10.86 (min 7.32, max 25.15), d: 10.86 (min 6.39, max 25.76) (SE +/- 0.02, N = 3)
  Weld Porosity Detection FP16: b: 16.65 (min 10, max 33.75), a: 16.69 (min 13.14, max 33.56), d: 16.72 (min 9.04, max 25.27), c: 16.76 (min 8.74, max 34.17) (SE +/- 0.01, N = 3)
  Face Detection Retail FP16-INT8: b: 3.69 (min 2.28, max 15.95), a: 3.71 (min 2.14, max 26.73), c: 3.71 (min 2.23, max 18), d: 3.71 (min 2.38, max 15.94) (SE +/- 0.01, N = 3)
  Road Segmentation ADAS FP16-INT8: d: 27.98 (min 14.99, max 46.26), a: 27.99 (min 18.95, max 38.76), b: 28.00 (min 18.81, max 38.86), c: 28.08 (min 14.29, max 41.57) (SE +/- 0.02, N = 3)
  Machine Translation EN To DE FP16: b: 135.70 (min 109.99, max 155.71), d: 135.96 (min 110.6, max 155.59), a: 136.00 (min 118.61, max 153.61), c: 136.15 (min 73.11, max 161.85) (SE +/- 0.06, N = 3)
  Weld Porosity Detection FP16-INT8: b: 14.49 (min 8.36, max 28.02), d: 14.51 (min 8.58, max 29.8), a: 14.52 (min 8.14, max 28.07), c: 14.56 (min 8.28, max 27.62) (SE +/- 0.01, N = 3)
  Person Vehicle Bike Detection FP16: a: 13.38 (min 7.23, max 35.43), c: 13.39 (min 7.1, max 31.9), d: 13.40 (min 8.15, max 34.65), b: 13.47 (min 7.33, max 31.02) (SE +/- 0.05, N = 3)
  Noise Suppression Poconet-Like FP16: b: 11.41 (min 7.86, max 31.96), d: 11.44 (min 9.06, max 31.81), c: 11.46 (min 6.19, max 32.08), a: 11.54 (min 6.61, max 30.79) (SE +/- 0.03, N = 3)
  Handwritten English Recognition FP16: b: 63.08 (min 42.84, max 83.6), c: 63.40 (min 37.56, max 90.19), a: 63.46 (min 45.07, max 85.2), d: 63.54 (min 41.89, max 89.71) (SE +/- 0.11, N = 3)
  Person Re-Identification Retail FP16: c: 9.97 (min 5.91, max 41.34), d: 9.97 (min 5.63, max 25.55), b: 9.98 (min 5.6, max 24.1), a: 10.00 (min 6.87, max 16.1) (SE +/- 0.01, N = 3)
  Age Gender Recognition Retail 0013 FP16: a: 0.97 (min 0.66, max 15.15), b: 0.97 (min 0.57, max 14.08), d: 0.97 (min 0.55, max 13.67), c: 0.98 (min 0.53, max 16.95) (SE +/- 0.00, N = 3)
  Handwritten English Recognition FP16-INT8: b: 56.30 (min 52.17, max 69.13), c: 56.64 (min 36.21, max 73.18), a: 56.87 (min 35.77, max 72.74), d: 56.90 (min 51.72, max 72.17) (SE +/- 0.01, N = 3)
  Age Gender Recognition Retail 0013 FP16-INT8: a: 0.53 (min 0.3, max 12.95), b: 0.53 (min 0.3, max 12.53), c: 0.53 (min 0.3, max 13.83), d: 0.53 (min 0.31, max 12.66) (SE +/- 0.00, N = 3)
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Neural Magic DeepSparse 1.7 (ms/batch, fewer is better)
  NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream: b: 447.65, a: 448.71, d: 448.93, c: 449.12 (SE +/- 0.38, N = 3)
  NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream: b: 54.01, c: 54.33, d: 54.34, a: 54.42 (SE +/- 0.05, N = 3)
  NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream: d: 17.48, b: 17.49, a: 17.51, c: 17.52 (SE +/- 0.01, N = 3)
  NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream: d: 5.1408, a: 5.1435, c: 5.1787, b: 5.2115 (SE +/- 0.0247, N = 3)
  ResNet-50, Baseline - Asynchronous Multi-Stream: b: 39.05, a: 39.05, d: 39.06, c: 39.07 (SE +/- 0.00, N = 3)
  ResNet-50, Baseline - Synchronous Single-Stream: a: 6.3927, d: 6.4001, c: 6.4155, b: 6.4286 (SE +/- 0.0148, N = 3)
  ResNet-50, Sparse INT8 - Asynchronous Multi-Stream: b: 5.8898, d: 5.9417, c: 5.9888, a: 6.0030 (SE +/- 0.0186, N = 3)
  ResNet-50, Sparse INT8 - Synchronous Single-Stream: b: 1.2468, c: 1.2478, a: 1.2491, d: 1.2559 (SE +/- 0.0030, N = 3)
  Llama2 Chat 7b Quantized - Asynchronous Multi-Stream: a: 5873.27, c: 5896.59, d: 5897.28, b: 5904.28 (SE +/- 9.69, N = 3)
  Llama2 Chat 7b Quantized - Synchronous Single-Stream: a: 165.40, c: 165.47, b: 165.55, d: 165.61 (SE +/- 0.11, N = 3)
  CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream: b: 39.03, a: 39.05, c: 39.06, d: 39.07 (SE +/- 0.00, N = 3)
  CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream: a: 6.3846, d: 6.4059, c: 6.4103, b: 6.4293 (SE +/- 0.0061, N = 3)
  CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream: b: 78.82, c: 79.25, a: 79.45, d: 79.51 (SE +/- 0.16, N = 3)
  CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream: a: 10.06, b: 10.07, d: 10.09, c: 10.12 (SE +/- 0.01, N = 3)
  NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream: b: 53.47, a: 53.48, d: 53.63, c: 53.76 (SE +/- 0.05, N = 3)
  NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream: a: 8.8921, c: 8.9189, b: 8.9317, d: 8.9625 (SE +/- 0.0057, N = 3)
  CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream: b: 393.08, d: 393.39, a: 393.66, c: 393.82 (SE +/- 0.32, N = 3)
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream b d a c 10 20 30 40 50 SE +/- 0.03, N = 3 45.96 46.02 46.06 46.08
Neural Magic DeepSparse Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream b c d a 8 16 24 32 40 SE +/- 0.06, N = 3 35.54 35.81 35.81 35.85
Neural Magic DeepSparse Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream b c a d 3 6 9 12 15 SE +/- 0.02, N = 3 13.13 13.14 13.17 13.20
Neural Magic DeepSparse Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream b a d c 100 200 300 400 500 SE +/- 0.66, N = 3 446.24 446.84 447.17 447.58
Neural Magic DeepSparse Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream b a c d 12 24 36 48 60 SE +/- 0.02, N = 3 53.97 54.07 54.08 54.10
Timed Linux Kernel Compilation 6.8 (Seconds, fewer is better):
Build: defconfig (SE +/- 0.59, N = 3): c: 52.81, a: 54.13, d: 54.16, b: 54.21
Build: allmodconfig (SE +/- 0.92, N = 3): c: 596.51, a: 597.03, d: 597.83, b: 597.90
Parallel BZIP2 Compression 1.1.13 (Seconds, fewer is better):
FreeBSD-13.0-RELEASE-amd64-memstick.img Compression (SE +/- 0.043942, N = 12): d: 3.221993, a: 3.230305, c: 3.285082, b: 3.314749
1. (CXX) g++ options: -O2 -pthread -lbz2 -lpthread
Primesieve 12.1 (Seconds, fewer is better):
Length: 1e12 (SE +/- 0.056, N = 3): b: 6.081, c: 6.113, d: 6.123, a: 6.244
Length: 1e13 (SE +/- 0.05, N = 3): b: 76.92, d: 77.06, a: 77.15, c: 77.21
1. (CXX) g++ options: -O3
WavPack Audio Encoding 5.7 (Seconds, fewer is better):
WAV To WavPack (SE +/- 0.000, N = 5): a: 4.433, c: 4.435, b: 4.438, d: 4.442
Phoronix Test Suite v10.8.5