7773x Tests for a future article. 2 x AMD EPYC 7573X 32-Core testing with an AMD DAYTONA_X (RYM1009B BIOS) and ASPEED on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2305044-NE-7773X849132&grs&rdt .
7773x system details
Configurations: a, b, 5 a, 5 b, 5 2p a, 5 2p b
  Processor: AMD EPYC 7773X 64-Core @ 2.20GHz (64 Cores / 128 Threads)
  Motherboard: AMD DAYTONA_X (RYM1009B BIOS)
  Chipset: AMD Starship/Matisse
  Memory: 256GB
  Disk: 3841GB Micron_9300_MTFDHAL3T8TDP
  Graphics: ASPEED
  Monitor: VE228
  Network: 2 x Mellanox MT27710
  OS: Ubuntu 22.04
  Kernel: 5.15.0-47-generic (x86_64)
  Desktop: GNOME Shell 42.4
  Display Server: X Server 1.21.1.3
  Vulkan: 1.2.204
  Compiler: GCC 11.2.0
  File-System: ext4
  Screen Resolution: 1920x1080
Values that differ in later configurations, in listed order:
  Processor: AMD EPYC 7573X 32-Core @ 2.80GHz (32 Cores / 64 Threads)
  Processor: 2 x AMD EPYC 7573X 32-Core @ 2.80GHz (64 Cores / 128 Threads)
  Memory: 512GB
Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details - Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa001229
Python Details - Python 3.10.6
Security Details - itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
7773x results overview: one row per test (ncnn, lczero, openfoam, opencv, onednn, petsc, askap, incompact3d, lulesh, specfem3d, openvino, john-the-ripper, blender, compress-7zip, pennant, embree, mt-dgemm, build-llvm, gromacs, openvkl, cloverleaf, build-ffmpeg, svt-av1, vvenc, ffmpeg, compress-zstd, clickhouse, quantlib, draco, espeak), one column per configuration (a, b, 5 a, 5 b, 5 2p a, 5 2p b). Individual results follow.
NCNN 20220729, Target: CPU - Model: mobilenet (ms, Fewer Is Better)
  b: 17.84 (MIN: 17.55 / MAX: 25.78)
  5 a: 13.70 (MIN: 13.55 / MAX: 14.38)
  5 2p a: 77.77 (MIN: 67.11 / MAX: 156)
  5 2p b: 115.68 (MIN: 64.45 / MAX: 159.93)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU - Model: mnasnet (ms, Fewer Is Better)
  b: 11.71 (MIN: 9.43 / MAX: 20.81)
  5 a: 6.08 (MIN: 6.01 / MAX: 6.54)
  5 2p a: 38.49 (MIN: 25.51 / MAX: 75.24)
  5 2p b: 49.46 (MIN: 36.42 / MAX: 176.46)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU - Model: resnet50 (ms, Fewer Is Better)
  b: 19.58 (MIN: 19.15 / MAX: 44.79)
  5 a: 14.80 (MIN: 14.6 / MAX: 16.84)
  5 2p a: 80.81 (MIN: 62.51 / MAX: 112.38)
  5 2p b: 120.27 (MIN: 42.99 / MAX: 192.41)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU - Model: googlenet (ms, Fewer Is Better)
  b: 19.09 (MIN: 18.77 / MAX: 25.76)
  5 a: 14.11 (MIN: 13.96 / MAX: 17.22)
  5 2p a: 93.29 (MIN: 52.46 / MAX: 137.19)
  5 2p b: 112.43 (MIN: 29.99 / MAX: 148.59)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  b: 15.44 (MIN: 13.63 / MAX: 18.67)
  5 a: 9.03 (MIN: 8.93 / MAX: 11.07)
  5 2p a: 35.12 (MIN: 33.83 / MAX: 41.96)
  5 2p b: 60.79 (MIN: 47.67 / MAX: 141.85)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
LeelaChessZero 0.28, Backend: Eigen (Nodes Per Second, More Is Better)
  b: 5142
  5 a: 1238
  5 2p a: 8286
  5 2p b: 7816
  1. (CXX) g++ options: -flto -pthread
NCNN 20220729, Target: CPU - Model: blazeface (ms, Fewer Is Better)
  b: 8.12 (MIN: 6.94 / MAX: 11.1)
  5 a: 3.82 (MIN: 3.45 / MAX: 79.74)
  5 2p a: 14.65 (MIN: 11.34 / MAX: 73.36)
  5 2p b: 24.56 (MIN: 20.64 / MAX: 91.2)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU - Model: resnet18 (ms, Fewer Is Better)
  b: 10.89 (MIN: 10.68 / MAX: 11.74)
  5 a: 8.12 (MIN: 8.01 / MAX: 10.04)
  5 2p a: 38.94 (MIN: 16.05 / MAX: 124.98)
  5 2p b: 51.80 (MIN: 16.09 / MAX: 93.56)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  b: 11.30 (MIN: 9.74 / MAX: 14.96)
  5 a: 6.46 (MIN: 6.35 / MAX: 8.73)
  5 2p a: 27.70 (MIN: 23.18 / MAX: 43.54)
  5 2p b: 38.67 (MIN: 28.02 / MAX: 119.64)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
LeelaChessZero 0.28, Backend: BLAS (Nodes Per Second, More Is Better)
  b: 5554
  5 a: 1419
  5 2p a: 8284
  5 2p b: 7991
  1. (CXX) g++ options: -flto -pthread
NCNN 20220729, Target: CPU - Model: FastestDet (ms, Fewer Is Better)
  b: 19.08 (MIN: 13.65 / MAX: 21.7)
  5 a: 9.25 (MIN: 9.12 / MAX: 9.81)
  5 2p a: 40.45 (MIN: 27.4 / MAX: 462.07)
  5 2p b: 53.25 (MIN: 30.07 / MAX: 66.77)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU - Model: alexnet (ms, Fewer Is Better)
  b: 7.13 (MIN: 6.97 / MAX: 7.76)
  5 a: 5.48 (MIN: 5.36 / MAX: 6.34)
  5 2p a: 18.60 (MIN: 17.59 / MAX: 34.64)
  5 2p b: 28.55 (MIN: 12.6 / MAX: 62.18)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  b: 15.60 (MIN: 12.95 / MAX: 19.73)
  5 a: 7.73 (MIN: 7.58 / MAX: 9.71)
  5 2p a: 33.27 (MIN: 29.37 / MAX: 96.69)
  5 2p b: 37.89 (MIN: 34.63 / MAX: 113.32)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU - Model: regnety_400m (ms, Fewer Is Better)
  b: 53.52 (MIN: 50.93 / MAX: 71.06)
  5 a: 26.34 (MIN: 25.99 / MAX: 28.26)
  5 2p a: 100.65 (MIN: 97.81 / MAX: 136.12)
  5 2p b: 128.35 (MIN: 111.43 / MAX: 240.27)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenFOAM 10, Input: drivaerFastback, Medium Mesh Size - Mesh Time (Seconds, Fewer Is Better)
  a: 25.35
  b: 120.39
  5 a: 117.44
  5 b: 117.60
  5 2p a: 99.38
  5 2p b: 99.93
  1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm
NCNN 20220729, Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  b: 10.29 (MIN: 9.51 / MAX: 11.93)
  5 a: 6.19 (MIN: 6.05 / MAX: 6.93)
  5 2p a: 21.26 (MIN: 20.79 / MAX: 28.25)
  5 2p b: 28.88 (MIN: 23.66 / MAX: 168.43)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenCV 4.7, Test: Core (ms, Fewer Is Better)
  b: 77061
  5 a: 68343
  5 2p b: 236445
  1. (CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared
OpenCV 4.7, Test: Object Detection (ms, Fewer Is Better)
  b: 31910
  5 a: 27739
  5 2p b: 88553
  1. (CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared
NCNN 20220729, Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better)
  b: 19.32 (MIN: 18.86 / MAX: 22.39)
  5 a: 14.85 (MIN: 14.49 / MAX: 25.32)
  5 2p a: 37.17 (MIN: 30.62 / MAX: 51.54)
  5 2p b: 35.81 (MIN: 29.14 / MAX: 58.45)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
oneDNN 3.1, Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  b: 6.92145 (MIN: 6.32)
  5 a: 4.49565 (MIN: 3.9)
  5 2p a: 10.52210 (MIN: 8.88)
  5 2p b: 10.36850 (MIN: 8.54)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
PETSc 3.19, Test: Streams (MB/s, More Is Better)
  b: 56185.75
  5 a: 31992.61
  5 2p b: 74130.18
  1. (CC) gcc options: -fPIC -O3 -O2 -lpthread -ludev -lpciaccess -lm
NCNN 20220729, Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better)
  b: 23.92 (MIN: 23.17 / MAX: 30.48)
  5 a: 19.98 (MIN: 19.53 / MAX: 23.19)
  5 2p a: 38.10 (MIN: 29.08 / MAX: 101.85)
  5 2p b: 45.89 (MIN: 34.08 / MAX: 64.66)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
oneDNN 3.1, Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  b: 1.26107 (MIN: 1.06)
  5 a: 1.37996 (MIN: 1.25)
  5 2p a: 2.67801 (MIN: 1.89)
  5 2p b: 2.88451 (MIN: 1.79)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenCV 4.7, Test: DNN - Deep Neural Network (ms, Fewer Is Better)
  b: 39756
  5 a: 39997
  5 2p b: 87133
  1. (CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared
ASKAP 1.0, Test: tConvolve MPI - Gridding (Mpix/sec, More Is Better)
  b: 36827.5
  5 a: 24990.1
  5 2p a: 49980.1
  5 2p b: 51199.1
  1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
Xcompact3d Incompact3d 2021-03-11, Input: input.i3d 129 Cells Per Direction (Seconds, Fewer Is Better)
  b: 4.43804121
  5 a: 5.06756401
  5 2p a: 2.48654294
  5 2p b: 2.48386908
  1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
LULESH 2.0.3 (z/s, More Is Better)
  b: 22257.14
  5 a: 20920.49
  5 2p a: 42030.04
  5 2p b: 42551.90
  1. (CXX) g++ options: -O3 -fopenmp -lm -lmpi_cxx -lmpi
OpenCV 4.7, Test: Graph API (ms, Fewer Is Better)
  b: 235810
  5 a: 206970
  5 2p b: 419702
  1. (CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared
SPECFEM3D 4.0, Model: Mount St. Helens (Seconds, Fewer Is Better)
  SE +/- 0.038048290, N = 3
  a: 11.678414616
  b: 11.753321460
  5 a: 18.862199070
  5 b: 19.355553092
  5 2p a: 9.609453868
  5 2p b: 9.594769350
  1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
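The "SE +/- ..., N = 3" annotation in the SPECFEM3D entry above reports a standard error over three runs. A minimal sketch of the conventional standard-error-of-the-mean calculation follows. It assumes the usual definition (sample standard deviation divided by the square root of N); whether the Phoronix Test Suite uses exactly this estimator is not confirmed by this export. The run times in the example are hypothetical placeholders, since the individual runs are not included in the result file.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical run times in seconds (NOT taken from the result file).
runs = [11.61, 11.68, 11.74]
se = standard_error(runs)
print(f"mean = {statistics.mean(runs):.3f} s, SE +/- {se:.3f}, N = {len(runs)}")
```

A small standard error relative to the mean suggests the runs were consistent; it says nothing about differences between configurations.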
OpenVINO 2022.3, Model: Person Vehicle Bike Detection FP16 - Device: CPU (FPS, More Is Better)
  b: 1888.20
  5 a: 1252.05
  5 2p a: 2523.57
  5 2p b: 2524.05
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3, Model: Vehicle Detection FP16-INT8 - Device: CPU (FPS, More Is Better)
  b: 1889.05
  5 a: 1244.86
  5 2p a: 2504.36
  5 2p b: 2498.74
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenFOAM 10, Input: drivaerFastback, Medium Mesh Size - Execution Time (Seconds, Fewer Is Better)
  a: 40.48
  b: 363.74
  5 a: 428.66
  5 b: 427.71
  5 2p a: 205.46
  5 2p b: 203.65
  1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm
ASKAP 1.0, Test: Hogbom Clean OpenMP (Iterations Per Second, More Is Better)
  b: 806.45
  5 a: 869.57
  5 2p a: 436.68
  5 2p b: 434.78
  1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP 1.0, Test: tConvolve MPI - Degridding (Mpix/sec, More Is Better)
  b: 32799.5
  5 a: 20991.7
  5 2p a: 41983.3
  5 2p b: 41983.3
  1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
oneDNN 3.1, Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  b: 1.246180 (MIN: 1.13)
  5 a: 0.625930 (MIN: 0.57)
  5 2p a: 0.911806 (MIN: 0.78)
  5 2p b: 0.917226 (MIN: 0.77)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenVINO 2022.3, Model: Weld Porosity Detection FP16-INT8 - Device: CPU (FPS, More Is Better)
  b: 2674.29
  5 a: 1767.13
  5 2p a: 3501.69
  5 2p b: 3504.15
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
John The Ripper 2023.03.14, Test: WPA PSK (Real C/S, More Is Better)
  SE +/- 297.59, N = 3
  a: 201388
  b: 202957
  5 a: 130848
  5 b: 131072
  5 2p a: 259413
  5 2p b: 258457
  1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2
John The Ripper 2023.03.14, Test: bcrypt (Real C/S, More Is Better)
  SE +/- 99.80, N = 3
  a: 86230
  b: 91276
  5 a: 60192
  5 b: 60152
  5 2p a: 118502
  5 2p b: 117542
  1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2
John The Ripper 2023.03.14, Test: Blowfish (Real C/S, More Is Better)
  SE +/- 185.66, N = 3
  a: 86460
  b: 87360
  5 a: 60325
  5 b: 60345
  5 2p a: 118579
  5 2p b: 117771
  1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2
SPECFEM3D 4.0, Model: Water-layered Halfspace (Seconds, Fewer Is Better)
  SE +/- 0.13, N = 3
  a: 28.08
  b: 27.47
  5 a: 45.71
  5 b: 44.33
  5 2p a: 23.94
  5 2p b: 23.27
  1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
SPECFEM3D 4.0, Model: Layered Halfspace (Seconds, Fewer Is Better)
  SE +/- 0.29, N = 3
  a: 30.52
  b: 30.12
  5 a: 49.14
  5 b: 48.74
  5 2p a: 25.22
  5 2p b: 25.47
  1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
OpenVINO 2022.3, Model: Face Detection FP16-INT8 - Device: CPU (FPS, More Is Better)
  b: 27.03
  5 a: 17.92
  5 2p a: 34.83
  5 2p b: 34.91
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3, Model: Person Detection FP32 - Device: CPU (FPS, More Is Better)
  b: 8.04
  5 a: 5.47
  5 2p a: 10.64
  5 2p b: 10.64
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3, Model: Vehicle Detection FP16 - Device: CPU (FPS, More Is Better)
  b: 1205.11
  5 a: 841.49
  5 2p a: 1632.81
  5 2p b: 1623.83
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3, Model: Person Detection FP16 - Device: CPU (FPS, More Is Better)
  b: 8.05
  5 a: 5.48
  5 2p a: 10.54
  5 2p b: 10.63
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3, Model: Weld Porosity Detection FP16 - Device: CPU (FPS, More Is Better)
  b: 1198.45
  5 a: 822.19
  5 2p a: 1594.14
  5 2p b: 1592.71
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
Blender 3.5, Blend File: Classroom - Compute: CPU-Only (Seconds, Fewer Is Better)
  SE +/- 0.17, N = 3
  a: 72.45
  b: 72.02
  5 a: 113.31
  5 2p a: 58.55
  5 2p b: 58.57
OpenVINO 2022.3, Model: Face Detection FP16 - Device: CPU (FPS, More Is Better)
  b: 12.16
  5 a: 8.23
  5 2p a: 15.86
  5 2p b: 15.76
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
oneDNN 3.1, Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  b: 749.62 (MIN: 731.75)
  5 a: 679.65 (MIN: 664.64)
  5 2p a: 1212.84 (MIN: 1070.52)
  5 2p b: 1305.96 (MIN: 1060.61)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Blender 3.5, Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better)
  SE +/- 0.09, N = 3
  a: 260.83
  b: 259.87
  5 a: 408.79
  5 2p a: 213.60
  5 2p b: 213.46
OpenVINO 2022.3, Model: Machine Translation EN To DE FP16 - Device: CPU (FPS, More Is Better)
  b: 138.84
  5 a: 92.61
  5 2p a: 175.48
  5 2p b: 176.76
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
Blender 3.5, Blend File: Pabellon Barcelona - Compute: CPU-Only (Seconds, Fewer Is Better)
  SE +/- 0.10, N = 3
  a: 88.76
  b: 88.73
  5 a: 136.58
  5 2p a: 71.64
  5 2p b: 71.73
John The Ripper 2023.03.14, Test: MD5 (Real C/S, More Is Better)
  SE +/- 1000.00, N = 3
  a: 5534000
  b: 5543000
  5 a: 3621000
  5 b: 3605000
  5 2p a: 6845000
  5 2p b: 6683000
  1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2
Blender 3.5, Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better)
  SE +/- 0.03, N = 3
  a: 27.61
  b: 27.60
  5 a: 42.52
  5 2p a: 22.59
  5 2p b: 22.51
Blender 3.5, Blend File: Fishy Cat - Compute: CPU-Only (Seconds, Fewer Is Better)
  SE +/- 0.04, N = 3
  a: 34.77
  b: 34.61
  5 a: 52.68
  5 2p a: 27.96
  5 2p b: 27.90
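Because some results are "Fewer Is Better" (seconds, ms) and others are "More Is Better" (FPS, C/S), a comparison between two configurations has to take the direction of the metric into account. The following is a small sketch of one way to do that; the helper name `speedup` is hypothetical, and reading "5 a" and "5 2p a" as a single-socket and a dual-socket run of the same processor is an assumption based on the configuration labels. The two values are the Blender Fishy Cat render times listed above.

```python
def speedup(baseline, candidate, fewer_is_better):
    """Return how many times better `candidate` is than `baseline`.

    A value above 1.0 means the candidate configuration did better,
    regardless of whether the metric counts time or throughput.
    """
    if baseline <= 0 or candidate <= 0:
        raise ValueError("results must be positive")
    if fewer_is_better:
        return baseline / candidate
    return candidate / baseline


# Blender Fishy Cat render times in seconds (Fewer Is Better).
fishy_cat = {"5 a": 52.68, "5 2p a": 27.96}
ratio = speedup(fishy_cat["5 a"], fishy_cat["5 2p a"], fewer_is_better=True)
print(f"5 2p a vs 5 a: {ratio:.2f}x")
```

The same helper applies to throughput metrics by passing `fewer_is_better=False`.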
John The Ripper 2023.03.14, Test: HMAC-SHA512 (Real C/S, More Is Better)
  SE +/- 266743.32, N = 3
  a: 136796000
  b: 135165000
  5 a: 96059000
  5 b: 94924000
  5 2p a: 79002000
  5 2p b: 73260000
  1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2
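The five John The Ripper tests in this file do not move in the same direction between "5 a" and "5 2p a": four roughly double, while HMAC-SHA512 drops. One common way to summarize ratios like these is a geometric mean. The sketch below is an illustration only, not a summary produced by the Phoronix Test Suite for this result; the values are copied from the "5 a" and "5 2p a" entries of the John The Ripper tests, and the dictionary and function names are hypothetical.

```python
import math

# Real C/S (More Is Better) as (5 a, 5 2p a), copied from the
# John The Ripper entries in this result file.
john_the_ripper = {
    "WPA PSK": (130848, 259413),
    "bcrypt": (60192, 118502),
    "Blowfish": (60325, 118579),
    "MD5": (3621000, 6845000),
    "HMAC-SHA512": (96059000, 79002000),
}


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Ratio above 1.0 means "5 2p a" reported a higher rate than "5 a".
ratios = {name: dual / single for name, (single, dual) in john_the_ripper.items()}
for name, r in ratios.items():
    print(f"{name}: {r:.2f}x")
print(f"geometric mean: {geometric_mean(ratios.values()):.2f}x")
```

A geometric mean is used here rather than an arithmetic mean because the inputs are ratios, so a halving and a doubling cancel out.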
7-Zip Compression Test: Decompression Rating OpenBenchmarking.org MIPS, More Is Better 7-Zip Compression 22.01 Test: Decompression Rating b 5 a 5 2p a 5 2p b 100K 200K 300K 400K 500K 411213 244220 453996 440740 1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC
Pennant Test: leblancbig OpenBenchmarking.org Hydro Cycle Time - Seconds, Fewer Is Better Pennant 1.0.1 Test: leblancbig b 5 a 5 2p a 5 2p b 2 4 6 8 10 5.104078 7.974022 4.294731 4.418012 1. (CXX) g++ options: -fopenmp -lmpi_cxx -lmpi
Embree 4.0.1 Binary: Pathtracer - Model: Crown (Frames Per Second, More Is Better): a = 69.11 (MIN: 68.18 / MAX: 71.75); 5 a = 43.95 (MIN: 43.49 / MAX: 44.42); 5 b = 43.98 (MIN: 43.29 / MAX: 44.67); 5 2p a = 81.45 (MIN: 80.49 / MAX: 83.06); 5 2p b = 81.09 (MIN: 80.03 / MAX: 82.56). SE +/- 0.05, N = 3.
Xcompact3d Incompact3d 2021-03-11 Input: input.i3d 193 Cells Per Direction (Seconds, Fewer Is Better): b = 17.21; 5 a = 19.63; 5 2p a = 10.62; 5 2p b = 10.76. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Pennant 1.0.1 Test: sedovbig (Hydro Cycle Time - Seconds, Fewer Is Better): b = 9.563610; 5 a = 14.001110; 5 2p a = 7.656369; 5 2p b = 7.793325. (CXX) g++ options: -fopenmp -lmpi_cxx -lmpi
Embree 4.0.1 Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better): a = 63.37 (MIN: 62.18 / MAX: 66.58); b = 63.36 (MIN: 62.44 / MAX: 66.52); 5 a = 40.24 (MIN: 39.64 / MAX: 40.95); 5 b = 40.12 (MIN: 39.64 / MAX: 40.56); 5 2p a = 73.33 (MIN: 71.99 / MAX: 75.06); 5 2p b = 73.26 (MIN: 72.13 / MAX: 74.82). SE +/- 0.11, N = 3.
ACES DGEMM 1.0 Sustained Floating-Point Rate (GFLOP/s, More Is Better): a = 29.38; b = 29.05; 5 a = 17.87; 5 b = 18.64; 5 2p a = 31.34; 5 2p b = 31.76. SE +/- 0.25, N = 15. (CC) gcc options: -O3 -march=native -fopenmp
SPECFEM3D 4.0 Model: Tomographic Model (Seconds, Fewer Is Better): a = 13.86; b = 13.53; 5 a = 19.09; 5 b = 19.63; 5 2p a = 11.22; 5 2p b = 11.80. SE +/- 0.15, N = 3. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
SPECFEM3D 4.0 Model: Homogeneous Halfspace (Seconds, Fewer Is Better): a = 17.21; b = 16.82; 5 a = 24.99; 5 b = 25.09; 5 2p a = 14.49; 5 2p b = 14.42. SE +/- 0.20, N = 3. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
oneDNN 3.1 Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): b = 0.827622 (MIN: 0.78); 5 a = 1.146560 (MIN: 1.09); 5 2p a = 0.674336 (MIN: 0.62); 5 2p b = 0.669349 (MIN: 0.6). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
ASKAP 1.0 Test: tConvolve OpenMP - Gridding (Million Grid Points Per Second, More Is Better): b = 20481.2; 5 a = 17750.4; 5 2p a = 12102.5; 5 2p b = 12678.9. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
oneDNN 3.1 Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): b = 3.12487 (MIN: 2.06); 5 a = 2.76234 (MIN: 2.59); 5 2p a = 1.85167 (MIN: 1.61); 5 2p b = 1.89859 (MIN: 1.58). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Timed LLVM Compilation 16.0 Build System: Ninja (Seconds, Fewer Is Better): a = 165.71; b = 164.45; 5 a = 230.38; 5 b = 231.09; 5 2p a = 138.72; 5 2p b = 138.57. SE +/- 0.08, N = 3.
GROMACS 2023 Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better): a = 7.290; b = 7.299; 5 a = 5.022; 5 2p a = 8.285; 5 2p b = 8.222. SE +/- 0.026, N = 3. (CXX) g++ options: -O3
Embree 4.0.1 Binary: Pathtracer ISPC - Model: Asian Dragon Obj (Frames Per Second, More Is Better): a = 64.51 (MIN: 63.96 / MAX: 66.18); b = 64.64 (MIN: 64.13 / MAX: 65.46); 5 a = 39.19 (MIN: 38.97 / MAX: 39.59); 5 b = 39.22 (MIN: 39 / MAX: 39.63); 5 2p a = 64.41 (MIN: 63.78 / MAX: 65.49); 5 2p b = 64.04 (MIN: 63.29 / MAX: 64.92). SE +/- 0.03, N = 3.
Embree 4.0.1 Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better): a = 73.84 (MIN: 73.31 / MAX: 75.9); b = 73.83 (MIN: 73.34 / MAX: 74.66); 5 a = 44.91 (MIN: 44.68 / MAX: 45.26); 5 b = 45.11 (MIN: 44.88 / MAX: 45.48); 5 2p a = 73.96 (MIN: 73.12 / MAX: 75.27); 5 2p b = 73.78 (MIN: 72.91 / MAX: 74.82). SE +/- 0.01, N = 3.
Embree 4.0.1 Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, More Is Better): a = 78.48 (MIN: 77.64 / MAX: 80.46); 5 a = 48.52 (MIN: 48.32 / MAX: 48.91); 5 b = 48.51 (MIN: 48.29 / MAX: 48.79); 5 2p a = 76.81 (MIN: 76.07 / MAX: 77.73); 5 2p b = 76.63 (MIN: 75.97 / MAX: 78.17). SE +/- 0.14, N = 3.
Embree 4.0.1 Binary: Pathtracer - Model: Asian Dragon Obj (Frames Per Second, More Is Better): a = 70.52 (MIN: 69.67 / MAX: 71.94); 5 a = 43.96 (MIN: 43.73 / MAX: 44.3); 5 b = 44.19 (MIN: 43.94 / MAX: 44.51); 5 2p a = 68.69 (MIN: 68.03 / MAX: 70.15); 5 2p b = 68.70 (MIN: 67.96 / MAX: 69.64). SE +/- 0.08, N = 3.
NCNN 20220729 Target: CPU - Model: vgg16 (ms, Fewer Is Better): b = 24.04 (MIN: 23.49 / MAX: 30.66); 5 a = 21.08 (MIN: 20.78 / MAX: 24.35); 5 2p a = 33.55 (MIN: 28.65 / MAX: 42.53); 5 2p b = 32.81 (MIN: 30.06 / MAX: 44.76). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenVINO 2022.3 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (FPS, More Is Better): b = 35737.82; 5 a = 25209.62; 5 2p a = 39455.86; 5 2p b = 38800.34. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenCV 4.7 Test: Stitching (ms, Fewer Is Better): b = 201160; 5 a = 184032; 5 2p b = 287492. (CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared
OpenFOAM 10 Input: drivaerFastback, Small Mesh Size - Execution Time (Seconds, Fewer Is Better): a = 40.45; b = 40.24; 5 a = 50.75; 5 b = 50.70; 5 2p a = 33.00; 5 2p b = 32.97. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm
ASKAP 1.0 Test: tConvolve MT - Degridding (Million Grid Points Per Second, More Is Better): b = 8308.33; 5 a = 9611.05; 5 2p a = 12506.70; 5 2p b = 11068.80. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
7-Zip Compression 22.01 Test: Compression Rating (MIPS, More Is Better): b = 396928; 5 a = 271918; 5 2p a = 403739; 5 2p b = 396844. (CXX) g++ options: -lpthread -ldl -O2 -fPIC
OpenVINO 2022.3 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (FPS, More Is Better): b = 38091.46; 5 a = 27340.96; 5 2p b = 40352.48. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
ASKAP 1.0 Test: tConvolve MT - Gridding (Million Grid Points Per Second, More Is Better): b = 5493.35; 5 a = 6599.68; 5 2p a = 7738.59; 5 2p b = 7176.41. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
OpenVINO 2022.3 Model: Vehicle Detection FP16 - Device: CPU (ms, Fewer Is Better): b = 26.54 (MIN: 14.19 / MAX: 63.19); 5 a = 19.00 (MIN: 13.62 / MAX: 32.75); 5 2p a = 19.58 (MIN: 11.25 / MAX: 73.87); 5 2p b = 19.69 (MIN: 11.03 / MAX: 75.83). (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (ms, Fewer Is Better): b = 1.62 (MIN: 0.69 / MAX: 13.34); 5 a = 1.16 (MIN: 0.66 / MAX: 12.23); 5 2p b = 1.33 (MIN: 0.64 / MAX: 26.34). (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVKL 1.3.1 Benchmark: vklBenchmark ISPC (Items / Sec, More Is Better): a = 470 (MIN: 84 / MAX: 2616); b = 469 (MIN: 84 / MAX: 2565); 5 a = 340 (MIN: 55 / MAX: 2309); 5 b = 339 (MIN: 54 / MAX: 2307); 5 2p a = 452 (MIN: 98 / MAX: 1875); 5 2p b = 452 (MIN: 99 / MAX: 2013). SE +/- 0.33, N = 3.
CloverLeaf Lagrangian-Eulerian Hydrodynamics (Seconds, Fewer Is Better): b = 11.34; 5 a = 12.09; 5 2p a = 15.67; 5 2p b = 15.51. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp
OpenVINO 2022.3 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, Fewer Is Better): b = 1.74 (MIN: 0.84 / MAX: 14.57); 5 a = 1.26 (MIN: 0.69 / MAX: 12.96); 5 2p a = 1.37 (MIN: 0.67 / MAX: 28.86); 5 2p b = 1.40 (MIN: 0.68 / MAX: 42.1). (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Weld Porosity Detection FP16 - Device: CPU (ms, Fewer Is Better): b = 26.68 (MIN: 15.32 / MAX: 46.54); 5 a = 19.45 (MIN: 11.98 / MAX: 28.4); 5 2p a = 20.06 (MIN: 11.41 / MAX: 47.48); 5 2p b = 20.07 (MIN: 13.26 / MAX: 75.83). (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Person Detection FP32 - Device: CPU (ms, Fewer Is Better): b = 3927.00 (MIN: 3402.58 / MAX: 4474.47); 5 a = 2886.12 (MIN: 1694.68 / MAX: 3104.67); 5 2p a = 2946.97 (MIN: 2004.15 / MAX: 3534.38); 5 2p b = 2952.76 (MIN: 2193.59 / MAX: 3652.32). (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Person Detection FP16 - Device: CPU (ms, Fewer Is Better): b = 3911.97 (MIN: 3337.4 / MAX: 4451.63); 5 a = 2881.04 (MIN: 1536.21 / MAX: 3142.35); 5 2p a = 2975.15 (MIN: 2241.37 / MAX: 3616.82); 5 2p b = 2948.39 (MIN: 1547.54 / MAX: 3537.3). (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
Timed LLVM Compilation 16.0 Build System: Unix Makefiles (Seconds, Fewer Is Better): a = 246.96; b = 248.48; 5 a = 290.54; 5 b = 297.22; 5 2p a = 219.40; 5 2p b = 221.79. SE +/- 0.96, N = 3.
OpenVINO 2022.3 Model: Face Detection FP16 - Device: CPU (ms, Fewer Is Better): b = 2608.34 (MIN: 2421.38 / MAX: 2754.89); 5 a = 1935.60 (MIN: 1852.05 / MAX: 1974.19); 5 2p a = 2013.52 (MIN: 1890.96 / MAX: 2802.82); 5 2p b = 2020.21 (MIN: 1823.96 / MAX: 3111.51). (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
Timed FFmpeg Compilation 6.0 Time To Compile (Seconds, Fewer Is Better): a = 16.93; b = 17.04; 5 a = 20.38; 5 b = 20.64; 5 2p a = 15.46; 5 2p b = 15.34. SE +/- 0.03, N = 3.
SVT-AV1 1.5 Encoder Mode: Preset 12 - Input: Bosphorus 4K (Frames Per Second, More Is Better): a = 222.84; b = 214.93; 5 a = 220.81; 5 b = 226.73; 5 2p a = 169.11; 5 2p b = 176.11. SE +/- 0.73, N = 3. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenVINO 2022.3 Model: Person Vehicle Bike Detection FP16 - Device: CPU (ms, Fewer Is Better): b = 16.93 (MIN: 13.88 / MAX: 33.11); 5 a = 12.77 (MIN: 7.39 / MAX: 23.62); 5 2p a = 12.66 (MIN: 7.58 / MAX: 53.45); 5 2p b = 12.66 (MIN: 8.38 / MAX: 48.88). (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Machine Translation EN To DE FP16 - Device: CPU (ms, Fewer Is Better): b = 230.27 (MIN: 166.99 / MAX: 311.89); 5 a = 172.53 (MIN: 81.35 / MAX: 207.95); 5 2p a = 182.10 (MIN: 117.14 / MAX: 548.18); 5 2p b = 180.91 (MIN: 124.49 / MAX: 288.01). (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Vehicle Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): b = 16.93 (MIN: 10.73 / MAX: 31.24); 5 a = 12.84 (MIN: 6.96 / MAX: 23.28); 5 2p a = 12.76 (MIN: 7.65 / MAX: 43.58); 5 2p b = 12.79 (MIN: 7.68 / MAX: 43.18). (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Face Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): b = 1175.01 (MIN: 982.69 / MAX: 1202.21); 5 a = 886.41 (MIN: 851.09 / MAX: 900.44); 5 2p a = 914.24 (MIN: 797.83 / MAX: 966.84); 5 2p b = 912.42 (MIN: 878.63 / MAX: 988.74). (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Weld Porosity Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): b = 23.92 (MIN: 15.02 / MAX: 35.74); 5 a = 18.10 (MIN: 9.23 / MAX: 28.11); 5 2p a = 18.26 (MIN: 10.53 / MAX: 60.19); 5 2p b = 18.25 (MIN: 8.84 / MAX: 40.73). (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
VVenC 1.8 Video Input: Bosphorus 1080p - Video Preset: Faster (Frames Per Second, More Is Better): a = 30.01; b = 30.11; 5 a = 31.84; 5 b = 31.80; 5 2p a = 24.49; 5 2p b = 24.27. SE +/- 0.03, N = 3. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
VVenC 1.8 Video Input: Bosphorus 4K - Video Preset: Faster (Frames Per Second, More Is Better): a = 11.367; b = 11.364; 5 a = 11.961; 5 b = 12.020; 5 2p a = 9.574; 5 2p b = 9.906. SE +/- 0.022, N = 3. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
ASKAP 1.0 Test: tConvolve OpenMP - Degridding (Million Grid Points Per Second, More Is Better): b = 19018.3; 5 a = 19018.3; 5 2p a = 16641.0; 5 2p b = 15662.1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
FFmpeg 6.0 Encoder: libx265 - Scenario: Live (FPS, More Is Better): a = 105.78; b = 106.98; 5 a = 110.49; 5 b = 110.29; 5 2p a = 92.52; 5 2p b = 107.29. SE +/- 0.29, N = 3. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg 6.0 Encoder: libx265 - Scenario: Live (Seconds, Fewer Is Better): a = 47.74; b = 47.21; 5 a = 45.71; 5 b = 45.79; 5 2p a = 54.58; 5 2p b = 47.07. SE +/- 0.13, N = 3. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
SVT-AV1 1.5 Encoder Mode: Preset 13 - Input: Bosphorus 4K (Frames Per Second, More Is Better): a = 195.92; b = 198.14; 5 a = 199.69; 5 b = 196.06; 5 2p a = 169.66; 5 2p b = 172.57. SE +/- 0.70, N = 3. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
oneDNN 3.1 Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): b = 1183.63 (MIN: 1162.73); 5 a = 1182.61 (MIN: 1161.83); 5 2p a = 1390.36 (MIN: 1287.43); 5 2p b = 1385.33 (MIN: 1314.76). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
SVT-AV1 1.5 Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, More Is Better): a = 73.44; b = 67.55; 5 a = 65.34; 5 b = 70.86; 5 2p a = 62.84; 5 2p b = 69.54. SE +/- 0.56, N = 12. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
VVenC 1.8 Video Input: Bosphorus 1080p - Video Preset: Fast (Frames Per Second, More Is Better): a = 16.55; b = 16.37; 5 a = 16.85; 5 b = 17.25; 5 2p a = 14.81; 5 2p b = 15.04. SE +/- 0.10, N = 3. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
SVT-AV1 1.5 Encoder Mode: Preset 12 - Input: Bosphorus 1080p (Frames Per Second, More Is Better): a = 604.84; b = 601.98; 5 a = 629.30; 5 b = 622.73; 5 2p a = 540.73; 5 2p b = 565.06. SE +/- 2.97, N = 3. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
VVenC 1.8 Video Input: Bosphorus 4K - Video Preset: Fast (Frames Per Second, More Is Better): a = 6.286; b = 6.128; 5 a = 6.406; 5 b = 6.390; 5 2p a = 5.548; 5 2p b = 5.768. SE +/- 0.010, N = 3. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
NCNN 20220729 Target: CPU - Model: vision_transformer (ms, Fewer Is Better): b = 133.39 (MIN: 129.7 / MAX: 252.77); 5 a = 126.26 (MIN: 125.42 / MAX: 132.03); 5 2p a = 144.31 (MIN: 140.5 / MAX: 157.63); 5 2p b = 145.49 (MIN: 141.27 / MAX: 245.34). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Zstd Compression 1.5.4 Compression Level: 3, Long Mode - Compression Speed (MB/s, More Is Better): b = 752.6; 5 a = 854.2; 5 2p a = 808.5; 5 2p b = 741.5. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
OpenFOAM 10 Input: drivaerFastback, Small Mesh Size - Mesh Time (Seconds, Fewer Is Better): a = 24.73; b = 25.60; 5 a = 22.34; 5 b = 22.71; 5 2p a = 22.56; 5 2p b = 23.34. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm
Zstd Compression 1.5.4 Compression Level: 8, Long Mode - Compression Speed (MB/s, More Is Better): b = 744.5; 5 a = 792.1; 5 2p a = 702.8; 5 2p b = 702.2. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
SVT-AV1 1.5 Encoder Mode: Preset 13 - Input: Bosphorus 1080p (Frames Per Second, More Is Better): a = 552.24; b = 549.83; 5 a = 571.08; 5 b = 560.57; 5 2p a = 525.15; 5 2p b = 546.03. SE +/- 5.77, N = 3. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Zstd Compression 1.5.4 Compression Level: 8 - Compression Speed (MB/s, More Is Better): b = 988.5; 5 a = 1067.8; 5 2p a = 1024.0; 5 2p b = 1013.4. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
FFmpeg 6.0 Encoder: libx265 - Scenario: Video On Demand (Seconds, Fewer Is Better): a = 173.47; b = 172.53; 5 a = 161.54; 5 2p a = 168.46; 5 2p b = 173.05. SE +/- 0.15, N = 3. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg 6.0 Encoder: libx265 - Scenario: Video On Demand (FPS, More Is Better): a = 43.67; b = 43.90; 5 a = 46.89; 5 2p a = 44.97; 5 2p b = 43.77. SE +/- 0.04, N = 3. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg 6.0 Encoder: libx265 - Scenario: Platform (FPS, More Is Better): a = 43.73; b = 43.90; 5 a = 46.70; 5 2p a = 44.31; 5 2p b = 43.64. SE +/- 0.02, N = 3. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg 6.0 Encoder: libx265 - Scenario: Platform (Seconds, Fewer Is Better): a = 173.23; b = 172.57; 5 a = 162.19; 5 2p a = 170.94; 5 2p b = 173.56. SE +/- 0.08, N = 3. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Zstd Compression 1.5.4 Compression Level: 12 - Compression Speed (MB/s, More Is Better): b = 270.8; 5 a = 289.1; 5 2p a = 283.6; 5 2p b = 282.8. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
ClickHouse 22.12.3.5 100M Rows Hits Dataset, Second Run (Queries Per Minute, Geo Mean, More Is Better): b = 439.07 (MIN: 35.82 / MAX: 6000); 5 a = 457.81 (MIN: 24.13 / MAX: 5454.55); 5 2p a = 463.68 (MIN: 41.64 / MAX: 4615.38); 5 2p b = 467.46 (MIN: 40.6 / MAX: 4615.38).
FFmpeg 6.0 Encoder: libx265 - Scenario: Upload (FPS, More Is Better): a = 21.59; b = 21.62; 5 a = 22.95; 5 b = 22.93; 5 2p a = 22.38; 5 2p b = 22.10. SE +/- 0.01, N = 3. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg 6.0 Encoder: libx265 - Scenario: Upload (Seconds, Fewer Is Better): a = 116.93; b = 116.81; 5 a = 110.01; 5 b = 110.14; 5 2p a = 112.82; 5 2p b = 114.26. SE +/- 0.05, N = 3. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
QuantLib 1.30 (MFLOPS, More Is Better): a = 2746.5; b = 2692.4; 5 a = 2756.9; 5 b = 2820.4; 5 2p a = 2760.9; 5 2p b = 2850.8. SE +/- 25.52, N = 3. (CXX) g++ options: -O3 -march=native -fPIE -pie
SVT-AV1 1.5 Encoder Mode: Preset 4 - Input: Bosphorus 1080p (Frames Per Second, More Is Better): a = 10.30; b = 10.23; 5 a = 10.76; 5 b = 10.83; 5 2p a = 10.41; 5 2p b = 10.50. SE +/- 0.02, N = 3. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 1.5 Encoder Mode: Preset 8 - Input: Bosphorus 1080p (Frames Per Second, More Is Better): a = 109.98; b = 109.71; 5 a = 107.38; 5 b = 108.36; 5 2p a = 107.10; 5 2p b = 104.17. SE +/- 1.23, N = 5. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
ClickHouse 22.12.3.5 100M Rows Hits Dataset, Third Run (Queries Per Minute, Geo Mean, More Is Better): b = 437.35 (MIN: 35.59 / MAX: 5454.55); 5 a = 456.30 (MIN: 24.65 / MAX: 5454.55); 5 2p a = 461.53 (MIN: 41.49 / MAX: 3000); 5 2p b = 448.03 (MIN: 41.38 / MAX: 2608.7).
Zstd Compression 1.5.4 Compression Level: 3, Long Mode - Decompression Speed (MB/s, More Is Better): b = 1300.8; 5 a = 1361.5; 5 2p a = 1332.3; 5 2p b = 1336.5. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
ClickHouse 22.12.3.5 100M Rows Hits Dataset, First Run / Cold Cache (Queries Per Minute, Geo Mean, More Is Better): b = 428.51 (MIN: 34.8 / MAX: 5454.55); 5 a = 437.20 (MIN: 24.3 / MAX: 6000); 5 2p a = 445.18 (MIN: 41.18 / MAX: 3157.89); 5 2p b = 440.56 (MIN: 40.98 / MAX: 4000).
Zstd Compression 1.5.4 Compression Level: 3 - Decompression Speed (MB/s, More Is Better): b = 1277.3; 5 a = 1326.5; 5 2p a = 1306.7; 5 2p b = 1309.5. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4 Compression Level: 8 - Decompression Speed (MB/s, More Is Better): b = 1390.0; 5 a = 1438.8; 5 2p a = 1437.5; 5 2p b = 1441.1. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4 Compression Level: 12 - Decompression Speed (MB/s, More Is Better): b = 1445.9; 5 a = 1478.1; 5 2p a = 1482.7; 5 2p b = 1498.9. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4 Compression Level: 19 - Compression Speed (MB/s, More Is Better): b = 17.6; 5 a = 18.2; 5 2p a = 18.0; 5 2p b = 18.0. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4 Compression Level: 19, Long Mode - Decompression Speed (MB/s, More Is Better): b = 1190.8; 5 a = 1229.1; 5 2p a = 1222.9; 5 2p b = 1227.1. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Google Draco 1.5.6 Model: Lion (ms, Fewer Is Better): a = 4867; b = 4885; 5 a = 4760; 5 2p a = 4734; 5 2p b = 4818. SE +/- 7.02, N = 3. (CXX) g++ options: -O3
Zstd Compression 1.5.4 Compression Level: 19 - Decompression Speed (MB/s, More Is Better): b = 1276.8; 5 a = 1304.2; 5 2p a = 1315.6; 5 2p b = 1312.0. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4 Compression Level: 3 - Compression Speed (MB/s, More Is Better): b = 2819.0; 5 a = 2856.6; 5 2p a = 2783.7; 5 2p b = 2775.9. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4 Compression Level: 19, Long Mode - Compression Speed (MB/s, More Is Better): b = 9.30; 5 a = 9.54; 5 2p a = 9.57; 5 2p b = 9.50. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Google Draco 1.5.6 Model: Church Facade (ms, Fewer Is Better): a = 5793; b = 5833; 5 a = 5718; 5 2p a = 5682; 5 2p b = 5729. SE +/- 3.21, N = 3. (CXX) g++ options: -O3
SVT-AV1 1.5 Encoder Mode: Preset 4 - Input: Bosphorus 4K (Frames Per Second, More Is Better): a = 4.863; b = 4.826; 5 a = 4.743; 5 b = 4.760; 5 2p a = 4.784; 5 2p b = 4.842. SE +/- 0.021, N = 3. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Zstd Compression 1.5.4 Compression Level: 8, Long Mode - Decompression Speed (MB/s, More Is Better): b = 1434.2; 5 a = 1460.1; 5 2p a = 1454.7; 5 2p b = 1445.1. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
eSpeak-NG Speech Engine 20200907 Text-To-Speech Synthesis (Seconds, Fewer Is Better): b = 28.16; 5 a = 27.83; 5 2p a = 27.87; 5 2p b = 28.14. (CC) gcc options: -O2 -std=c99
Phoronix Test Suite v10.8.5