xeon eo march 2 x Intel Xeon Platinum 8380 testing with a Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS) and ASPEED on Ubuntu 22.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2304015-NE-XEONEOMAR92&grs .
xeon eo march Processor Motherboard Chipset Memory Disk Graphics Monitor Network OS Kernel Desktop Display Server Vulkan Compiler File-System Screen Resolution a b c 2 x Intel Xeon Platinum 8380 @ 3.40GHz (80 Cores / 160 Threads) Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS) Intel Ice Lake IEH 512GB 7682GB INTEL SSDPF2KX076TZ ASPEED VE228 2 x Intel X710 for 10GBASE-T + 2 x Intel E810-C for QSFP Ubuntu 22.10 6.2.0-rc5-phx-dodt (x86_64) GNOME Shell 43.0 X Server 1.21.1.3 1.3.224 GCC 12.2.0 ext4 1920x1080 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0xd000375 Python Details - Python 3.10.7 Security Details - dodt: Mitigation of DOITM + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
xeon eo march john-the-ripper: HMAC-SHA512 stress-ng: System V Message Passing john-the-ripper: MD5 stress-ng: Atomic john-the-ripper: Blowfish john-the-ripper: WPA PSK stress-ng: Socket Activity john-the-ripper: bcrypt compress-zstd: 8 - Decompression Speed opencv: Features 2D stress-ng: MEMFD memcached: 1:100 memcached: 1:10 opencv: Image Processing compress-zstd: 3, Long Mode - Compression Speed onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard memcached: 1:5 stress-ng: CPU Cache stress-ng: Zlib daphne: OpenMP - NDT Mapping opencv: Object Detection onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU compress-zstd: 19, Long Mode - Compression Speed daphne: OpenMP - Points2Image opencv: Graph API compress-zstd: 12 - Compression Speed stress-ng: Futex onnx: ArcFace ResNet-100 - CPU - Standard opencv: Video apache: 200 compress-zstd: 3 - Compression Speed rocksdb: Read While Writing onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU onednn: Recurrent Neural Network Training - u8s8f32 - CPU opencv: DNN - Deep Neural Network daphne: OpenMP - Euclidean Cluster opencv: Core onnx: fcn-resnet101-11 - CPU - Parallel compress-zstd: 8, Long Mode - Compression Speed compress-zstd: 8 - Compression Speed onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU onednn: Deconvolution Batch shapes_3d - f32 - CPU aom-av1: Speed 9 Realtime - Bosphorus 1080p tensorflow: CPU - 16 - AlexNet aom-av1: Speed 6 Realtime - Bosphorus 4K aom-av1: Speed 6 Two-Pass - Bosphorus 4K openssl: SHA256 onnx: super-resolution-10 - CPU - Parallel onnx: yolov4 - CPU - Standard stress-ng: Function Call rocksdb: Read Rand Write Rand specfem3d: Tomographic Model aom-av1: Speed 10 Realtime - Bosphorus 4K opencv: Stitching onednn: Recurrent Neural Network Inference - u8s8f32 - CPU stress-ng: Glibc C String Functions tensorflow: CPU - 64 - AlexNet aom-av1: Speed 9 Realtime - Bosphorus 4K specfem3d: Water-layered Halfspace vpxenc: Speed 5 - Bosphorus 4K stress-ng: CPU Stress stress-ng: Semaphores aom-av1: Speed 6 Two-Pass - Bosphorus 1080p stress-ng: Pthread aom-av1: Speed 10 Realtime - Bosphorus 1080p onnx: bertsquad-12 - CPU - Parallel aom-av1: Speed 6 Realtime - Bosphorus 1080p aom-av1: Speed 0 Two-Pass - Bosphorus 4K compress-zstd: 8, Long Mode - Decompression Speed ffmpeg: libx264 - Video On Demand ffmpeg: libx264 - Video On Demand onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU vvenc: Bosphorus 1080p - Faster onnx: CaffeNet 12-int8 - CPU - Parallel svt-av1: Preset 12 - Bosphorus 4K build2: Time To Compile stress-ng: Crypto build-nodejs: Time To Compile ffmpeg: libx265 - Live ffmpeg: libx265 - Live onnx: ArcFace ResNet-100 - CPU - Parallel svt-av1: Preset 8 - Bosphorus 4K ffmpeg: libx264 - Live ffmpeg: libx264 - Live onnx: bertsquad-12 - CPU - Standard vvenc: Bosphorus 4K - Fast compress-zstd: 12 - Decompression Speed tensorflow: CPU - 32 - AlexNet compress-zstd: 19 - Compression Speed specfem3d: Mount St. Helens svt-av1: Preset 4 - Bosphorus 4K onnx: GPT-2 - CPU - Standard stress-ng: Hash vpxenc: Speed 5 - Bosphorus 1080p stress-ng: SENDFILE compress-zstd: 19 - Decompression Speed rocksdb: Rand Read openssl: SHA512 onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU rocksdb: Rand Fill Sync onnx: GPT-2 - CPU - Parallel onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU stress-ng: Matrix Math ffmpeg: libx264 - Platform ffmpeg: libx264 - Upload ffmpeg: libx264 - Platform aom-av1: Speed 4 Two-Pass - Bosphorus 4K onnx: ResNet50 v1-12-int8 - CPU - Standard build-godot: Time To Compile ffmpeg: libx264 - Upload onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel compress-zstd: 3, Long Mode - Decompression Speed tensorflow: CPU - 16 - GoogLeNet onnx: ResNet50 v1-12-int8 - CPU - Parallel vvenc: Bosphorus 1080p - Fast tensorflow: CPU - 16 - ResNet-50 nginx: 500 dav1d: Summer Nature 4K specfem3d: Homogeneous Halfspace aom-av1: Speed 8 Realtime - Bosphorus 1080p onnx: yolov4 - CPU - Parallel embree: Pathtracer ISPC - Asian Dragon compress-zstd: 19, Long Mode - Decompression Speed dav1d: Chimera 1080p vpxenc: Speed 0 - Bosphorus 1080p tensorflow: CPU - 32 - GoogLeNet stress-ng: Forking stress-ng: NUMA onednn: Recurrent Neural Network Inference - f32 - CPU tensorflow: CPU - 512 - AlexNet onnx: fcn-resnet101-11 - CPU - Standard onnx: CaffeNet 12-int8 - CPU - Standard ffmpeg: libx265 - Upload tensorflow: CPU - 256 - ResNet-50 stress-ng: MMAP ffmpeg: libx265 - Upload onednn: Convolution Batch Shapes Auto - f32 - CPU apache: 500 dav1d: Summer Nature 1080p aom-av1: Speed 8 Realtime - Bosphorus 4K draco: Church Facade build-llvm: Unix Makefiles onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU blender: BMW27 - CPU-Only mysqlslap: 1024 mysqlslap: 2048 openssl: AES-128-GCM mysqlslap: 4096 embree: Pathtracer - Crown embree: Pathtracer ISPC - Crown aom-av1: Speed 4 Two-Pass - Bosphorus 1080p nginx: 200 onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU draco: Lion gromacs: MPI CPU - water_GMX50_bare tensorflow: CPU - 512 - ResNet-50 tensorflow: CPU - 64 - ResNet-50 vpxenc: Speed 0 - Bosphorus 4K stress-ng: Poll blender: Fishy Cat - CPU-Only blender: Classroom - CPU-Only ffmpeg: libx265 - Video On Demand tensorflow: CPU - 256 - AlexNet ffmpeg: libx265 - Video On Demand embree: Pathtracer - Asian Dragon Obj tensorflow: CPU - 32 - ResNet-50 stress-ng: Mutex vvenc: Bosphorus 4K - Faster build-llvm: Ninja stress-ng: Malloc compress-zstd: 3 - Decompression Speed blender: Barbershop - CPU-Only blender: Pabellon Barcelona - CPU-Only embree: Pathtracer ISPC - Asian Dragon Obj dav1d: Chimera 1080p 10-bit svt-av1: Preset 13 - Bosphorus 4K specfem3d: Layered Halfspace onnx: super-resolution-10 - CPU - Standard tensorflow: CPU - 256 - GoogLeNet embree: Pathtracer - Asian Dragon ffmpeg: libx265 - Platform ffmpeg: libx265 - Platform openssl: AES-256-GCM tensorflow: CPU - 64 - GoogLeNet build-ffmpeg: Time To Compile stress-ng: Memory Copying tensorflow: CPU - 512 - GoogLeNet rocksdb: Rand Fill stress-ng: Glibc Qsort Data Sorting openssl: RSA4096 openssl: RSA4096 stress-ng: IO_uring openssl: ChaCha20-Poly1305 stress-ng: Vector Math rocksdb: Seq Fill rocksdb: Update Rand openssl: ChaCha20 stress-ng: Context Switching stress-ng: x86_64 RdRand mysqlslap: 512 aom-av1: Speed 0 Two-Pass - Bosphorus 1080p onednn: Recurrent Neural Network Training - f32 - CPU onednn: Deconvolution Batch shapes_1d - f32 - CPU onednn: IP Shapes 3D - bf16bf16bf16 - CPU onednn: IP Shapes 1D - bf16bf16bf16 - CPU onednn: IP Shapes 3D - u8s8f32 - CPU onednn: IP Shapes 1D - u8s8f32 - CPU onednn: IP Shapes 3D - f32 - CPU onednn: IP Shapes 1D - f32 - CPU onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel onnx: super-resolution-10 - CPU - Standard onnx: super-resolution-10 - CPU - Parallel onnx: ResNet50 v1-12-int8 - CPU - Standard onnx: ResNet50 v1-12-int8 - CPU - Parallel onnx: ArcFace ResNet-100 - CPU - Standard onnx: ArcFace ResNet-100 - CPU - Parallel onnx: fcn-resnet101-11 - CPU - Standard onnx: fcn-resnet101-11 - CPU - Parallel onnx: CaffeNet 12-int8 - CPU - Standard onnx: CaffeNet 12-int8 - CPU - Parallel onnx: bertsquad-12 - CPU - Standard onnx: bertsquad-12 - CPU - Parallel onnx: yolov4 - CPU - Standard onnx: yolov4 - CPU - Parallel onnx: GPT-2 - CPU - Standard onnx: GPT-2 - CPU - Parallel nginx: 100 a b c 151993000 10295375.81 10384000 139.8 114499 483328 94302.89 114777 1128.8 270872 766.9 1970590.49 2137824.96 356989 258 26.5412 1774418.86 1441584.05 6374.22 817.52 119494 3.67056 7.88 9474.39 671086 161.4 957096.2 36.1961 121173 34680.57 2013.1 8059987 0.201235 705.678 115385 842.32 88425 3.1156 295 624 711.380 0.875338 31.7 138.07 16.15 8.16 57611766850 61.7891 11.3641 429335.83 920426 15.298794168 15.82 508105 453.911 60467587.4 236.08 16.04 31.466335812 5.2 152655.29 11840389.35 15.97 87858.71 30.04 13.2549 28.78 0.33 1161.6 36.21 209.220285516 2.10120 13.353 624.463 116.885 93.004 105894.93 203.675 33.78 149.480929923 32.6794 37.744 133.05 37.96 16.1257 3.261 1095.6 178.52 14.6 13.743206283 2.375 183.295 12949136.28 11.42 1143575.46 907.7 273039094 22617583540 455.595 73440 92.431 0.373237 335705.12 212.621669857 9.91 35.63 5.01 219.832 148.771 254.898947776 22.2889 1190.2 73.81 190.263 7.998 23.33 222945.56 66.17 18.02764317 29.39 10.1914 103.136 917.2 134.66 5.71 100.57 65824.56 437.26 466.729 518.73 9.37499 699.963 8.62 60.21 3124.36 292.801657432 1.41955 37442.6 96.8 15.39 6765 286.217 1.17192 24.2 110 111 799202790060 115 71.2659 87.9967 8.82 245504.32 3.90427 5502 9.268 71.38 38.9 2.88 9551392.98 32.22 63.07 448.20691755 408.72 16.90 76.1659 30.87 36280764.76 5.029 180.771 190168476.52 1128.4 244.2 76.92 89.1801 130.69 97.161 29.635032517 196.246 214.3 84.3522 448.633638741 16.88 709965487310 132.43 17.808 10941.37 257.44 104159 1550.53 36965 1189343.7 26355.62 293957452870 306730.85 106061 97772 425091630970 3511158.35 658591.72 110 0.73 708.530 35.4599 2.30887 18.01146 30.271432 2.56877 2.03996 1.46571 37.6739 44.8631 5.09499 16.1828 4.54812 5.25466 27.6247 30.5988 106.662 320.962 1.42719 1.59956 62.0109 75.4412 87.992 98.1183 5.45281 10.8107 152981000 10283641.13 10295000 163.26 107388 482323 79477.34 114825 1140.4 356993 583.83 1946110.38 1736416.84 293481 313.7 32.0474 2133650.03 1691720.8 5549.21 771.26 103986 4.16467 7.25 9912.90 597049 180.6 1018129.24 33.6787 133100 38092.88 2185 7531197 0.21833 656.795 107827 796.63 83121 2.93086 278.4 603 752.425 0.925478 31.1 145.79 15.53 8.4 54810975320 58.8271 11.8969 413303.53 959340 14.714132502 16.43 489265 437.395 59352709.91 228.16 16.56 30.372995715 5.04 152455.93 12234144.91 15.58 87532.4 30.21 13.0285 27.93 0.34 1163.4 35.16 215.44 2.16346 13.1 607.235 119.691 94.969 108459.26 202.826 34.57 146.069290258 32.6796 38.178 130.09 38.82 16.3145 3.221 1075.5 174.81 14.3 13.461312471 2.336 186.943 13204487.51 11.53 1162795.27 924 268195000 22641175270 447.363 74771 91.7637 0.37954 341137.35 212.54 9.97 35.64 4.93 216.364 148.85 253.32 21.9478 1172.3 72.81 187.423 8.014 22.99 226231.19 65.21 18.182305176 29.81 10.1411 104.4793 928.7 133.01 5.66 101.79 66467.95 432.4 471.804 513.22 9.47539 707.306 8.66 60.09 3115.23 291.569506705 1.40563 37076.95 96.07 15.36 6702 286.237 1.16101 24.36 110 112 802646430240 115 71.8254 88.571 8.79 244509.87 3.93436 5479 9.199 71.21 38.63 2.9 9566416.75 32.29 62.71 448.99 407.51 16.87 76.6639 30.8 36505304.6 4.998 181.169 191206280.2 1132.3 243.71 77.29 89.2864 130.12 97.19 29.523601764 197.018 213.49 84.2763 448.26 16.90 712463205990 132.26 17.865 10922.24 258.17 104356 1546.69 36899.1 1186897.4 26364.63 294463651570 306299.59 106147 97861 425550559910 3508321.29 658461.44 110 0.73 737.563 39.8033 2.21121 5.37587 0.668506 10.0555 1.80895 1.6588 31.2002 45.5602 5.075 16.9976 4.62093 5.33429 29.689 30.5984 105.532 341.193 1.41212 1.64503 61.2937 76.7518 84.0506 98.6054 5.34648 10.8897 19292000 73426960.15 2766000 349.89 51517 255829 50201.57 64015 831.7 583.2 1527158.77 2222180.6 267.4 1820534.68 1657780.82 6511.98 892.13 6.95 8791.23 169.9 1056011.64 37.0538 2209.8 8192793 848.69 2.93257 285.6 638.2 30.01 145.45 16.33 7.99 57348072400 11.6941 431183.83 954852 14.958818098 16.33 58278938.78 227.64 16.63 30.958272243 5.02 157803.19 12232435.5 16.07 90272.05 30.98 13.4286 28.61 0.34 1129.3 35.33 214.38 12.977 624.35 120.064 92.475 107612.44 207.61 34.24 147.51 31.9347 37.328 130.88 38.58 15.9607 3.191 1072.1 175.31 14.4 13.567236154 2.327 184.883 13066446.38 11.31 1165355.2 907 273197200 22228899690 74573 93.3205 341249.75 209.19 9.81 36.21 4.95 146.51 257.31 1189.7 72.7 8.119 23.23 225697.84 65.7 18.287061065 29.56 10.0594 104.2777 920.9 134.1 5.73 101.18 65696.65 433.52 514.4 9.4022 705.795 8.71 60.71 3092.95 289.92 95.86 15.51 6767 288.948 24.42 111 111 806241823680 114 71.888 87.85 8.86 246434.82 5521 9.259 71.74 38.71 2.88 9500903.35 32.07 63.14 451.26 410.27 16.79 76.4083 31 36276764.04 5.012 180.059 190376822.58 1126.3 244.89 77.15 88.8879 130.27 97.574 29.647600618 213.69 84.5925 449.86 16.84 710407167180 131.98 17.837 10908.2 258.13 104086 1547.2 36977.5 1188772.4 26405.01 294001151180 306225.09 106227 97722 425195437530 3511599.94 658542.68 110 0.73 26.9844 31.3122 106.353 340.994 1.41518 1.59992 62.6519 74.4648 85.5088 99.4055 5.4061 10.7076 OpenBenchmarking.org
John The Ripper Test: HMAC-SHA512 OpenBenchmarking.org Real C/S, More Is Better John The Ripper 2023.03.14 Test: HMAC-SHA512 a b c 30M 60M 90M 120M 150M 151993000 152981000 19292000 1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2
Stress-NG Test: System V Message Passing OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: System V Message Passing a b c 16M 32M 48M 64M 80M 10295375.81 10283641.13 73426960.15 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
John The Ripper Test: MD5 OpenBenchmarking.org Real C/S, More Is Better John The Ripper 2023.03.14 Test: MD5 a b c 2M 4M 6M 8M 10M 10384000 10295000 2766000 1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2
Stress-NG Test: Atomic OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Atomic a b c 80 160 240 320 400 139.80 163.26 349.89 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
John The Ripper Test: Blowfish OpenBenchmarking.org Real C/S, More Is Better John The Ripper 2023.03.14 Test: Blowfish a b c 20K 40K 60K 80K 100K 114499 107388 51517 1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2
John The Ripper Test: WPA PSK OpenBenchmarking.org Real C/S, More Is Better John The Ripper 2023.03.14 Test: WPA PSK a b c 100K 200K 300K 400K 500K 483328 482323 255829 1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2
Stress-NG Test: Socket Activity OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Socket Activity a b c 20K 40K 60K 80K 100K 94302.89 79477.34 50201.57 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
John The Ripper Test: bcrypt OpenBenchmarking.org Real C/S, More Is Better John The Ripper 2023.03.14 Test: bcrypt a b c 20K 40K 60K 80K 100K 114777 114825 64015 1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2
Zstd Compression Compression Level: 8 - Decompression Speed OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.5.4 Compression Level: 8 - Decompression Speed a b c 200 400 600 800 1000 1128.8 1140.4 831.7 1. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
OpenCV Test: Features 2D OpenBenchmarking.org ms, Fewer Is Better OpenCV 4.7 Test: Features 2D a b 80K 160K 240K 320K 400K 270872 356993 1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
Stress-NG Test: MEMFD OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: MEMFD a b c 170 340 510 680 850 766.90 583.83 583.20 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Memcached Set To Get Ratio: 1:100 OpenBenchmarking.org Ops/sec, More Is Better Memcached 1.6.19 Set To Get Ratio: 1:100 a b c 400K 800K 1200K 1600K 2000K 1970590.49 1946110.38 1527158.77 1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Memcached Set To Get Ratio: 1:10 OpenBenchmarking.org Ops/sec, More Is Better Memcached 1.6.19 Set To Get Ratio: 1:10 a b c 500K 1000K 1500K 2000K 2500K 2137824.96 1736416.84 2222180.60 1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
OpenCV Test: Image Processing OpenBenchmarking.org ms, Fewer Is Better OpenCV 4.7 Test: Image Processing a b 80K 160K 240K 320K 400K 356989 293481 1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
Zstd Compression Compression Level: 3, Long Mode - Compression Speed OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.5.4 Compression Level: 3, Long Mode - Compression Speed a b c 70 140 210 280 350 258.0 313.7 267.4 1. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
ONNX Runtime Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard a b 7 14 21 28 35 26.54 32.05 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
Memcached Set To Get Ratio: 1:5 OpenBenchmarking.org Ops/sec, More Is Better Memcached 1.6.19 Set To Get Ratio: 1:5 a b c 500K 1000K 1500K 2000K 2500K 1774418.86 2133650.03 1820534.68 1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Stress-NG Test: CPU Cache OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: CPU Cache a b c 400K 800K 1200K 1600K 2000K 1441584.05 1691720.80 1657780.82 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG Test: Zlib OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Zlib a b c 1400 2800 4200 5600 7000 6374.22 5549.21 6511.98 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Darmstadt Automotive Parallel Heterogeneous Suite Backend: OpenMP - Kernel: NDT Mapping OpenBenchmarking.org Test Cases Per Minute, More Is Better Darmstadt Automotive Parallel Heterogeneous Suite 2021.11.02 Backend: OpenMP - Kernel: NDT Mapping a b c 200 400 600 800 1000 817.52 771.26 892.13 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp
OpenCV Test: Object Detection OpenBenchmarking.org ms, Fewer Is Better OpenCV 4.7 Test: Object Detection a b 30K 60K 90K 120K 150K 119494 103986 1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
oneDNN Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU a b 0.9371 1.8742 2.8113 3.7484 4.6855 SE +/- 0.04476, N = 14 3.67056 4.16467 MIN: 3.52 MIN: 3.58 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Zstd Compression Compression Level: 19, Long Mode - Compression Speed OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.5.4 Compression Level: 19, Long Mode - Compression Speed a b c 2 4 6 8 10 7.88 7.25 6.95 1. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Darmstadt Automotive Parallel Heterogeneous Suite Backend: OpenMP - Kernel: Points2Image OpenBenchmarking.org Test Cases Per Minute, More Is Better Darmstadt Automotive Parallel Heterogeneous Suite 2021.11.02 Backend: OpenMP - Kernel: Points2Image a b c 2K 4K 6K 8K 10K 9474.39 9912.90 8791.23 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp
OpenCV Test: Graph API OpenBenchmarking.org ms, Fewer Is Better OpenCV 4.7 Test: Graph API a b 140K 280K 420K 560K 700K 671086 597049 1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
Zstd Compression Compression Level: 12 - Compression Speed OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.5.4 Compression Level: 12 - Compression Speed a b c 40 80 120 160 200 161.4 180.6 169.9 1. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Stress-NG Test: Futex OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Futex a b c 200K 400K 600K 800K 1000K 957096.20 1018129.24 1056011.64 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
ONNX Runtime Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard a b c 9 18 27 36 45 36.20 33.68 37.05 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
OpenCV Test: Video OpenBenchmarking.org ms, Fewer Is Better OpenCV 4.7 Test: Video a b 30K 60K 90K 120K 150K 121173 133100 1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
Apache HTTP Server Concurrent Requests: 200 OpenBenchmarking.org Requests Per Second, More Is Better Apache HTTP Server 2.4.56 Concurrent Requests: 200 a b 8K 16K 24K 32K 40K 34680.57 38092.88 1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
Zstd Compression Compression Level: 3 - Compression Speed OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.5.4 Compression Level: 3 - Compression Speed a b c 500 1000 1500 2000 2500 2013.1 2185.0 2209.8 1. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
RocksDB Test: Read While Writing OpenBenchmarking.org Op/s, More Is Better RocksDB 8.0 Test: Read While Writing a b c 2M 4M 6M 8M 10M 8059987 7531197 8192793 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
oneDNN Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU a b 0.0491 0.0982 0.1473 0.1964 0.2455 SE +/- 0.001478, N = 11 0.201235 0.218330 MIN: 0.18 MIN: 0.19 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU a b 150 300 450 600 750 SE +/- 9.93, N = 15 705.68 656.80 MIN: 626 MIN: 632.15 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenCV Test: DNN - Deep Neural Network OpenBenchmarking.org ms, Fewer Is Better OpenCV 4.7 Test: DNN - Deep Neural Network a b 20K 40K 60K 80K 100K 115385 107827 1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
Darmstadt Automotive Parallel Heterogeneous Suite Backend: OpenMP - Kernel: Euclidean Cluster OpenBenchmarking.org Test Cases Per Minute, More Is Better Darmstadt Automotive Parallel Heterogeneous Suite 2021.11.02 Backend: OpenMP - Kernel: Euclidean Cluster a b c 200 400 600 800 1000 842.32 796.63 848.69 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp
OpenCV Test: Core OpenBenchmarking.org ms, Fewer Is Better OpenCV 4.7 Test: Core a b 20K 40K 60K 80K 100K 88425 83121 1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
ONNX Runtime Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel a b c 0.701 1.402 2.103 2.804 3.505 3.11560 2.93086 2.93257 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
Zstd Compression Compression Level: 8, Long Mode - Compression Speed OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.5.4 Compression Level: 8, Long Mode - Compression Speed a b c 60 120 180 240 300 295.0 278.4 285.6 1. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression Compression Level: 8 - Compression Speed OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.5.4 Compression Level: 8 - Compression Speed a b c 140 280 420 560 700 624.0 603.0 638.2 1. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
oneDNN Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU a b 160 320 480 640 800 SE +/- 2.91, N = 3 711.38 752.43 MIN: 680.93 MIN: 723.13 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU a b 0.2082 0.4164 0.6246 0.8328 1.041 SE +/- 0.008834, N = 6 0.875338 0.925478 MIN: 0.82 MIN: 0.82 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
AOM AV1 Encoder Mode: Speed 9 Realtime - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 3.6 Encoder Mode: Speed 9 Realtime - Input: Bosphorus 1080p a b c 7 14 21 28 35 31.70 31.10 30.01 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
TensorFlow Device: CPU - Batch Size: 16 - Model: AlexNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 16 - Model: AlexNet a b c 30 60 90 120 150 138.07 145.79 145.45
AOM AV1 Encoder Mode: Speed 6 Realtime - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 3.6 Encoder Mode: Speed 6 Realtime - Input: Bosphorus 4K a b c 4 8 12 16 20 16.15 15.53 16.33 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
AOM AV1 Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 3.6 Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 4K a b c 2 4 6 8 10 8.16 8.40 7.99 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
OpenSSL Algorithm: SHA256 OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.1 Algorithm: SHA256 a b c 12000M 24000M 36000M 48000M 60000M 57611766850 54810975320 57348072400 1. (CC) gcc options: -pthread -m64 -O3 -ldl
ONNX Runtime Model: super-resolution-10 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: super-resolution-10 - Device: CPU - Executor: Parallel a b 14 28 42 56 70 61.79 58.83 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: yolov4 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: yolov4 - Device: CPU - Executor: Standard a b c 3 6 9 12 15 11.36 11.90 11.69 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
Stress-NG Test: Function Call OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Function Call a b c 90K 180K 270K 360K 450K 429335.83 413303.53 431183.83 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
RocksDB Test: Read Random Write Random OpenBenchmarking.org Op/s, More Is Better RocksDB 8.0 Test: Read Random Write Random a b c 200K 400K 600K 800K 1000K 920426 959340 954852 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
SPECFEM3D Model: Tomographic Model OpenBenchmarking.org Seconds, Fewer Is Better SPECFEM3D 4.0 Model: Tomographic Model a b c 4 8 12 16 20 15.30 14.71 14.96 1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
AOM AV1 Encoder Mode: Speed 10 Realtime - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 3.6 Encoder Mode: Speed 10 Realtime - Input: Bosphorus 4K a b c 4 8 12 16 20 15.82 16.43 16.33 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
OpenCV Test: Stitching OpenBenchmarking.org ms, Fewer Is Better OpenCV 4.7 Test: Stitching a b 110K 220K 330K 440K 550K 508105 489265 1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
oneDNN Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU a b 100 200 300 400 500 SE +/- 2.89, N = 3 453.91 437.40 MIN: 437.57 MIN: 425.41 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Stress-NG Test: Glibc C String Functions OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Glibc C String Functions a b c 13M 26M 39M 52M 65M 60467587.40 59352709.91 58278938.78 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
TensorFlow Device: CPU - Batch Size: 64 - Model: AlexNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 64 - Model: AlexNet a b c 50 100 150 200 250 236.08 228.16 227.64
AOM AV1 Encoder Mode: Speed 9 Realtime - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 3.6 Encoder Mode: Speed 9 Realtime - Input: Bosphorus 4K a b c 4 8 12 16 20 16.04 16.56 16.63 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
SPECFEM3D Model: Water-layered Halfspace OpenBenchmarking.org Seconds, Fewer Is Better SPECFEM3D 4.0 Model: Water-layered Halfspace a b c 7 14 21 28 35 31.47 30.37 30.96 1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
VP9 libvpx Encoding Speed: Speed 5 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better VP9 libvpx Encoding 1.13 Speed: Speed 5 - Input: Bosphorus 4K a b c 1.17 2.34 3.51 4.68 5.85 5.20 5.04 5.02 1. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=gnu++11
Stress-NG Test: CPU Stress OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: CPU Stress a b c 30K 60K 90K 120K 150K 152655.29 152455.93 157803.19 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG Test: Semaphores OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Semaphores a b c 3M 6M 9M 12M 15M 11840389.35 12234144.91 12232435.50 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
AOM AV1 Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 3.6 Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 1080p a b c 4 8 12 16 20 15.97 15.58 16.07 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
Stress-NG Test: Pthread OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Pthread a b c 20K 40K 60K 80K 100K 87858.71 87532.40 90272.05 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
AOM AV1 Encoder Mode: Speed 10 Realtime - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 3.6 Encoder Mode: Speed 10 Realtime - Input: Bosphorus 1080p a b c 7 14 21 28 35 30.04 30.21 30.98 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
ONNX Runtime Model: bertsquad-12 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: bertsquad-12 - Device: CPU - Executor: Parallel a b c 3 6 9 12 15 13.25 13.03 13.43 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
AOM AV1 Encoder Mode: Speed 6 Realtime - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 3.6 Encoder Mode: Speed 6 Realtime - Input: Bosphorus 1080p a b c 7 14 21 28 35 28.78 27.93 28.61 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
AOM AV1 Encoder Mode: Speed 0 Two-Pass - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 3.6 Encoder Mode: Speed 0 Two-Pass - Input: Bosphorus 4K a b c 0.0765 0.153 0.2295 0.306 0.3825 0.33 0.34 0.34 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
Zstd Compression Compression Level: 8, Long Mode - Decompression Speed OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.5.4 Compression Level: 8, Long Mode - Decompression Speed a b c 300 600 900 1200 1500 1161.6 1163.4 1129.3 1. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
FFmpeg Encoder: libx264 - Scenario: Video On Demand OpenBenchmarking.org FPS, More Is Better FFmpeg 6.0 Encoder: libx264 - Scenario: Video On Demand a b c 8 16 24 32 40 36.21 35.16 35.33 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx264 - Scenario: Video On Demand OpenBenchmarking.org Seconds, Fewer Is Better FFmpeg 6.0 Encoder: libx264 - Scenario: Video On Demand a b c 50 100 150 200 250 209.22 215.44 214.38 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
oneDNN Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU a b 0.4868 0.9736 1.4604 1.9472 2.434 SE +/- 0.02056, N = 3 2.10120 2.16346 MIN: 2.03 MIN: 2.03 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
VVenC Video Input: Bosphorus 1080p - Video Preset: Faster OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.7 Video Input: Bosphorus 1080p - Video Preset: Faster a b c 3 6 9 12 15 13.35 13.10 12.98 1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
ONNX Runtime Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel a b c 130 260 390 520 650 624.46 607.24 624.35 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
SVT-AV1 Encoder Mode: Preset 12 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.4 Encoder Mode: Preset 12 - Input: Bosphorus 4K a b c 30 60 90 120 150 116.89 119.69 120.06 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Build2 Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Build2 0.15 Time To Compile a b c 20 40 60 80 100 93.00 94.97 92.48
Stress-NG Test: Crypto OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Crypto a b c 20K 40K 60K 80K 100K 105894.93 108459.26 107612.44 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Timed Node.js Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed Node.js Compilation 19.8.1 Time To Compile a b c 50 100 150 200 250 203.68 202.83 207.61
FFmpeg Encoder: libx265 - Scenario: Live OpenBenchmarking.org FPS, More Is Better FFmpeg 6.0 Encoder: libx265 - Scenario: Live a b c 8 16 24 32 40 33.78 34.57 34.24 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx265 - Scenario: Live OpenBenchmarking.org Seconds, Fewer Is Better FFmpeg 6.0 Encoder: libx265 - Scenario: Live a b c 30 60 90 120 150 149.48 146.07 147.51 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
ONNX Runtime Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel a b c 8 16 24 32 40 32.68 32.68 31.93 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
SVT-AV1 Encoder Mode: Preset 8 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.4 Encoder Mode: Preset 8 - Input: Bosphorus 4K a b c 9 18 27 36 45 37.74 38.18 37.33 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
FFmpeg Encoder: libx264 - Scenario: Live OpenBenchmarking.org FPS, More Is Better FFmpeg 6.0 Encoder: libx264 - Scenario: Live a b c 30 60 90 120 150 133.05 130.09 130.88 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx264 - Scenario: Live OpenBenchmarking.org Seconds, Fewer Is Better FFmpeg 6.0 Encoder: libx264 - Scenario: Live a b c 9 18 27 36 45 37.96 38.82 38.58 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
ONNX Runtime Model: bertsquad-12 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: bertsquad-12 - Device: CPU - Executor: Standard a b c 4 8 12 16 20 16.13 16.31 15.96 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
VVenC Video Input: Bosphorus 4K - Video Preset: Fast OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.7 Video Input: Bosphorus 4K - Video Preset: Fast a b c 0.7337 1.4674 2.2011 2.9348 3.6685 3.261 3.221 3.191 1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
Zstd Compression Compression Level: 12 - Decompression Speed OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.5.4 Compression Level: 12 - Decompression Speed a b c 200 400 600 800 1000 1095.6 1075.5 1072.1 1. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
TensorFlow Device: CPU - Batch Size: 32 - Model: AlexNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 32 - Model: AlexNet a b c 40 80 120 160 200 178.52 174.81 175.31
Zstd Compression Compression Level: 19 - Compression Speed OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.5.4 Compression Level: 19 - Compression Speed a b c 4 8 12 16 20 14.6 14.3 14.4 1. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
SPECFEM3D Model: Mount St. Helens OpenBenchmarking.org Seconds, Fewer Is Better SPECFEM3D 4.0 Model: Mount St. Helens a b c 4 8 12 16 20 13.74 13.46 13.57 1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
SVT-AV1 Encoder Mode: Preset 4 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.4 Encoder Mode: Preset 4 - Input: Bosphorus 4K a b c 0.5344 1.0688 1.6032 2.1376 2.672 2.375 2.336 2.327 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
ONNX Runtime Model: GPT-2 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: GPT-2 - Device: CPU - Executor: Standard a b c 40 80 120 160 200 183.30 186.94 184.88 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
Stress-NG Test: Hash OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Hash a b c 3M 6M 9M 12M 15M 12949136.28 13204487.51 13066446.38 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
VP9 libvpx Encoding Speed: Speed 5 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better VP9 libvpx Encoding 1.13 Speed: Speed 5 - Input: Bosphorus 1080p a b c 3 6 9 12 15 11.42 11.53 11.31 1. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=gnu++11
Stress-NG Test: SENDFILE OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: SENDFILE a b c 200K 400K 600K 800K 1000K 1143575.46 1162795.27 1165355.20 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Zstd Compression Compression Level: 19 - Decompression Speed OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.5.4 Compression Level: 19 - Decompression Speed a b c 200 400 600 800 1000 907.7 924.0 907.0 1. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
RocksDB Test: Random Read OpenBenchmarking.org Op/s, More Is Better RocksDB 8.0 Test: Random Read a b c 60M 120M 180M 240M 300M 273039094 268195000 273197200 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
OpenSSL Algorithm: SHA512 OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.1 Algorithm: SHA512 a b c 5000M 10000M 15000M 20000M 25000M 22617583540 22641175270 22228899690 1. (CC) gcc options: -pthread -m64 -O3 -ldl
oneDNN Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU a b 100 200 300 400 500 SE +/- 6.28, N = 3 455.60 447.36 MIN: 430.98 MIN: 435.41 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
RocksDB Test: Random Fill Sync OpenBenchmarking.org Op/s, More Is Better RocksDB 8.0 Test: Random Fill Sync a b c 16K 32K 48K 64K 80K 73440 74771 74573 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
ONNX Runtime Model: GPT-2 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: GPT-2 - Device: CPU - Executor: Parallel a b c 20 40 60 80 100 92.43 91.76 93.32 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
oneDNN Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU a b 0.0854 0.1708 0.2562 0.3416 0.427 SE +/- 0.003364, N = 3 0.373237 0.379540 MIN: 0.33 MIN: 0.33 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Stress-NG Test: Matrix Math OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Matrix Math a b c 70K 140K 210K 280K 350K 335705.12 341137.35 341249.75 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
FFmpeg Encoder: libx264 - Scenario: Platform OpenBenchmarking.org Seconds, Fewer Is Better FFmpeg 6.0 Encoder: libx264 - Scenario: Platform a b c 50 100 150 200 250 212.62 212.54 209.19 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx264 - Scenario: Upload OpenBenchmarking.org FPS, More Is Better FFmpeg 6.0 Encoder: libx264 - Scenario: Upload a b c 3 6 9 12 15 9.91 9.97 9.81 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx264 - Scenario: Platform OpenBenchmarking.org FPS, More Is Better FFmpeg 6.0 Encoder: libx264 - Scenario: Platform a b c 8 16 24 32 40 35.63 35.64 36.21 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
AOM AV1 Encoder Mode: Speed 4 Two-Pass - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 3.6 Encoder Mode: Speed 4 Two-Pass - Input: Bosphorus 4K a b c 1.1273 2.2546 3.3819 4.5092 5.6365 5.01 4.93 4.95 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
ONNX Runtime Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard a b 50 100 150 200 250 219.83 216.36 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
Timed Godot Game Engine Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed Godot Game Engine Compilation 4.0 Time To Compile a b c 30 60 90 120 150 148.77 148.85 146.51
FFmpeg Encoder: libx264 - Scenario: Upload OpenBenchmarking.org Seconds, Fewer Is Better FFmpeg 6.0 Encoder: libx264 - Scenario: Upload a b c 60 120 180 240 300 254.90 253.32 257.31 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
ONNX Runtime Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel a b 5 10 15 20 25 22.29 21.95 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
Zstd Compression Compression Level: 3, Long Mode - Decompression Speed OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.5.4 Compression Level: 3, Long Mode - Decompression Speed a b c 300 600 900 1200 1500 1190.2 1172.3 1189.7 1. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
TensorFlow Device: CPU - Batch Size: 16 - Model: GoogLeNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 16 - Model: GoogLeNet a b c 16 32 48 64 80 73.81 72.81 72.70
ONNX Runtime Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel a b 40 80 120 160 200 190.26 187.42 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
VVenC Video Input: Bosphorus 1080p - Video Preset: Fast OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.7 Video Input: Bosphorus 1080p - Video Preset: Fast a b c 2 4 6 8 10 7.998 8.014 8.119 1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
TensorFlow Device: CPU - Batch Size: 16 - Model: ResNet-50 OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 16 - Model: ResNet-50 a b c 6 12 18 24 30 23.33 22.99 23.23
nginx Connections: 500 OpenBenchmarking.org Requests Per Second, More Is Better nginx 1.23.2 Connections: 500 a b c 50K 100K 150K 200K 250K 222945.56 226231.19 225697.84 1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
dav1d Video Input: Summer Nature 4K OpenBenchmarking.org FPS, More Is Better dav1d 1.1 Video Input: Summer Nature 4K a b c 15 30 45 60 75 66.17 65.21 65.70 1. (CC) gcc options: -pthread
SPECFEM3D Model: Homogeneous Halfspace OpenBenchmarking.org Seconds, Fewer Is Better SPECFEM3D 4.0 Model: Homogeneous Halfspace a b c 4 8 12 16 20 18.03 18.18 18.29 1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
AOM AV1 Encoder Mode: Speed 8 Realtime - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 3.6 Encoder Mode: Speed 8 Realtime - Input: Bosphorus 1080p a b c 7 14 21 28 35 29.39 29.81 29.56 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
ONNX Runtime Model: yolov4 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: yolov4 - Device: CPU - Executor: Parallel a b c 3 6 9 12 15 10.19 10.14 10.06 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
Embree Binary: Pathtracer ISPC - Model: Asian Dragon OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.0.1 Binary: Pathtracer ISPC - Model: Asian Dragon a b c 20 40 60 80 100 103.14 104.48 104.28 MIN: 101.66 / MAX: 108.37 MIN: 103 / MAX: 108.93 MIN: 102.35 / MAX: 108.15
Zstd Compression Compression Level: 19, Long Mode - Decompression Speed OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.5.4 Compression Level: 19, Long Mode - Decompression Speed a b c 200 400 600 800 1000 917.2 928.7 920.9 1. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
dav1d Video Input: Chimera 1080p OpenBenchmarking.org FPS, More Is Better dav1d 1.1 Video Input: Chimera 1080p a b c 30 60 90 120 150 134.66 133.01 134.10 1. (CC) gcc options: -pthread
VP9 libvpx Encoding Speed: Speed 0 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better VP9 libvpx Encoding 1.13 Speed: Speed 0 - Input: Bosphorus 1080p a b c 1.2893 2.5786 3.8679 5.1572 6.4465 5.71 5.66 5.73 1. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=gnu++11
TensorFlow Device: CPU - Batch Size: 32 - Model: GoogLeNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 32 - Model: GoogLeNet a b c 20 40 60 80 100 100.57 101.79 101.18
Stress-NG Test: Forking OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Forking a b c 14K 28K 42K 56K 70K 65824.56 66467.95 65696.65 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG Test: NUMA OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: NUMA a b c 90 180 270 360 450 437.26 432.40 433.52 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
oneDNN Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU a b 100 200 300 400 500 SE +/- 6.11, N = 15 466.73 471.80 MIN: 406.85 MIN: 461.64 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
TensorFlow Device: CPU - Batch Size: 512 - Model: AlexNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 512 - Model: AlexNet a b c 110 220 330 440 550 518.73 513.22 514.40
ONNX Runtime Model: fcn-resnet101-11 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: fcn-resnet101-11 - Device: CPU - Executor: Standard a b c 3 6 9 12 15 9.37499 9.47539 9.40220 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard a b c 150 300 450 600 750 699.96 707.31 705.80 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
FFmpeg Encoder: libx265 - Scenario: Upload OpenBenchmarking.org FPS, More Is Better FFmpeg 6.0 Encoder: libx265 - Scenario: Upload a b c 2 4 6 8 10 8.62 8.66 8.71 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
TensorFlow Device: CPU - Batch Size: 256 - Model: ResNet-50 OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 256 - Model: ResNet-50 a b c 14 28 42 56 70 60.21 60.09 60.71
Stress-NG Test: MMAP OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: MMAP a b c 700 1400 2100 2800 3500 3124.36 3115.23 3092.95 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
FFmpeg Encoder: libx265 - Scenario: Upload OpenBenchmarking.org Seconds, Fewer Is Better FFmpeg 6.0 Encoder: libx265 - Scenario: Upload a b c 60 120 180 240 300 292.80 291.57 289.92 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
oneDNN Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU a b 0.3194 0.6388 0.9582 1.2776 1.597 SE +/- 0.00845, N = 3 1.41955 1.40563 MIN: 1.24 MIN: 1.16 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Apache HTTP Server Concurrent Requests: 500 OpenBenchmarking.org Requests Per Second, More Is Better Apache HTTP Server 2.4.56 Concurrent Requests: 500 a b 8K 16K 24K 32K 40K 37442.60 37076.95 1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
dav1d Video Input: Summer Nature 1080p OpenBenchmarking.org FPS, More Is Better dav1d 1.1 Video Input: Summer Nature 1080p a b c 20 40 60 80 100 96.80 96.07 95.86 1. (CC) gcc options: -pthread
AOM AV1 Encoder Mode: Speed 8 Realtime - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 3.6 Encoder Mode: Speed 8 Realtime - Input: Bosphorus 4K a b c 4 8 12 16 20 15.39 15.36 15.51 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
Google Draco Model: Church Facade OpenBenchmarking.org ms, Fewer Is Better Google Draco 1.5.6 Model: Church Facade a b c 1500 3000 4500 6000 7500 6765 6702 6767 1. (CXX) g++ options: -O3
Timed LLVM Compilation Build System: Unix Makefiles OpenBenchmarking.org Seconds, Fewer Is Better Timed LLVM Compilation 16.0 Build System: Unix Makefiles a b c 60 120 180 240 300 286.22 286.24 288.95
oneDNN Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU a b 0.2637 0.5274 0.7911 1.0548 1.3185 SE +/- 0.00262, N = 3 1.17192 1.16101 MIN: 0.97 MIN: 0.98 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Blender Blend File: BMW27 - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 3.5 Blend File: BMW27 - Compute: CPU-Only a b c 6 12 18 24 30 24.20 24.36 24.42
MariaDB Clients: 1024 OpenBenchmarking.org Queries Per Second, More Is Better MariaDB 11.0.1 Clients: 1024 a b c 20 40 60 80 100 110 110 111 1. (CXX) g++ options: -fPIC -pie -fstack-protector -O3 -shared -lrt -lpthread -lz -ldl -lm -lstdc++
MariaDB Clients: 2048 OpenBenchmarking.org Queries Per Second, More Is Better MariaDB 11.0.1 Clients: 2048 a b c 30 60 90 120 150 111 112 111 1. (CXX) g++ options: -fPIC -pie -fstack-protector -O3 -shared -lrt -lpthread -lz -ldl -lm -lstdc++
OpenSSL Algorithm: AES-128-GCM OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.1 Algorithm: AES-128-GCM a b c 200000M 400000M 600000M 800000M 1000000M 799202790060 802646430240 806241823680 1. (CC) gcc options: -pthread -m64 -O3 -ldl
MariaDB Clients: 4096 OpenBenchmarking.org Queries Per Second, More Is Better MariaDB 11.0.1 Clients: 4096 a b c 30 60 90 120 150 115 115 114 1. (CXX) g++ options: -fPIC -pie -fstack-protector -O3 -shared -lrt -lpthread -lz -ldl -lm -lstdc++
Embree Binary: Pathtracer - Model: Crown OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.0.1 Binary: Pathtracer - Model: Crown a b c 16 32 48 64 80 71.27 71.83 71.89 MIN: 67.7 / MAX: 79.85 MIN: 68.18 / MAX: 79.7 MIN: 67.5 / MAX: 80.61
Embree Binary: Pathtracer ISPC - Model: Crown OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.0.1 Binary: Pathtracer ISPC - Model: Crown a b c 20 40 60 80 100 88.00 88.57 87.85 MIN: 84.5 / MAX: 93.11 MIN: 84.55 / MAX: 93.3 MIN: 84.35 / MAX: 92.29
AOM AV1 Encoder Mode: Speed 4 Two-Pass - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 3.6 Encoder Mode: Speed 4 Two-Pass - Input: Bosphorus 1080p a b c 2 4 6 8 10 8.82 8.79 8.86 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
nginx Connections: 200 OpenBenchmarking.org Requests Per Second, More Is Better nginx 1.23.2 Connections: 200 a b c 50K 100K 150K 200K 250K 245504.32 244509.87 246434.82 1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
oneDNN Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU a b 0.8852 1.7704 2.6556 3.5408 4.426 SE +/- 0.00278, N = 3 3.90427 3.93436 MIN: 3.68 MIN: 3.68 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Google Draco Model: Lion OpenBenchmarking.org ms, Fewer Is Better Google Draco 1.5.6 Model: Lion a b c 1200 2400 3600 4800 6000 5502 5479 5521 1. (CXX) g++ options: -O3
GROMACS Implementation: MPI CPU - Input: water_GMX50_bare OpenBenchmarking.org Ns Per Day, More Is Better GROMACS 2023 Implementation: MPI CPU - Input: water_GMX50_bare a b c 3 6 9 12 15 9.268 9.199 9.259 1. (CXX) g++ options: -O3
TensorFlow Device: CPU - Batch Size: 512 - Model: ResNet-50 OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 512 - Model: ResNet-50 a b c 16 32 48 64 80 71.38 71.21 71.74
TensorFlow Device: CPU - Batch Size: 64 - Model: ResNet-50 OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 64 - Model: ResNet-50 a b c 9 18 27 36 45 38.90 38.63 38.71
VP9 libvpx Encoding Speed: Speed 0 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better VP9 libvpx Encoding 1.13 Speed: Speed 0 - Input: Bosphorus 4K a b c 0.6525 1.305 1.9575 2.61 3.2625 2.88 2.90 2.88 1. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=gnu++11
Stress-NG Test: Poll OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Poll a b c 2M 4M 6M 8M 10M 9551392.98 9566416.75 9500903.35 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Blender Blend File: Fishy Cat - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 3.5 Blend File: Fishy Cat - Compute: CPU-Only a b c 7 14 21 28 35 32.22 32.29 32.07
Blender Blend File: Classroom - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 3.5 Blend File: Classroom - Compute: CPU-Only a b c 14 28 42 56 70 63.07 62.71 63.14
FFmpeg Encoder: libx265 - Scenario: Video On Demand OpenBenchmarking.org Seconds, Fewer Is Better FFmpeg 6.0 Encoder: libx265 - Scenario: Video On Demand a b c 100 200 300 400 500 448.21 448.99 451.26 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
TensorFlow Device: CPU - Batch Size: 256 - Model: AlexNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 256 - Model: AlexNet a b c 90 180 270 360 450 408.72 407.51 410.27
FFmpeg Encoder: libx265 - Scenario: Video On Demand OpenBenchmarking.org FPS, More Is Better FFmpeg 6.0 Encoder: libx265 - Scenario: Video On Demand a b c 4 8 12 16 20 16.90 16.87 16.79 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Embree Binary: Pathtracer - Model: Asian Dragon Obj OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.0.1 Binary: Pathtracer - Model: Asian Dragon Obj a b c 20 40 60 80 100 76.17 76.66 76.41 MIN: 73.69 / MAX: 79.67 MIN: 74.78 / MAX: 80.24 MIN: 74.48 / MAX: 78.97
TensorFlow Device: CPU - Batch Size: 32 - Model: ResNet-50 OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 32 - Model: ResNet-50 a b c 7 14 21 28 35 30.87 30.80 31.00
Stress-NG Test: Mutex OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Mutex a b c 8M 16M 24M 32M 40M 36280764.76 36505304.60 36276764.04 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
VVenC Video Input: Bosphorus 4K - Video Preset: Faster OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.7 Video Input: Bosphorus 4K - Video Preset: Faster a b c 1.1315 2.263 3.3945 4.526 5.6575 5.029 4.998 5.012 1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
Timed LLVM Compilation Build System: Ninja OpenBenchmarking.org Seconds, Fewer Is Better Timed LLVM Compilation 16.0 Build System: Ninja a b c 40 80 120 160 200 180.77 181.17 180.06
Stress-NG Test: Malloc OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Malloc a b c 40M 80M 120M 160M 200M 190168476.52 191206280.20 190376822.58 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Zstd Compression Compression Level: 3 - Decompression Speed OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.5.4 Compression Level: 3 - Decompression Speed a b c 200 400 600 800 1000 1128.4 1132.3 1126.3 1. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Blender Blend File: Barbershop - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 3.5 Blend File: Barbershop - Compute: CPU-Only a b c 50 100 150 200 250 244.20 243.71 244.89
Blender Blend File: Pabellon Barcelona - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 3.5 Blend File: Pabellon Barcelona - Compute: CPU-Only a b c 20 40 60 80 100 76.92 77.29 77.15
Embree Binary: Pathtracer ISPC - Model: Asian Dragon Obj OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.0.1 Binary: Pathtracer ISPC - Model: Asian Dragon Obj a b c 20 40 60 80 100 89.18 89.29 88.89 MIN: 87.45 / MAX: 91.82 MIN: 87.48 / MAX: 93.27 MIN: 87.09 / MAX: 93.32
dav1d Video Input: Chimera 1080p 10-bit OpenBenchmarking.org FPS, More Is Better dav1d 1.1 Video Input: Chimera 1080p 10-bit a b c 30 60 90 120 150 130.69 130.12 130.27 1. (CC) gcc options: -pthread
SVT-AV1 Encoder Mode: Preset 13 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.4 Encoder Mode: Preset 13 - Input: Bosphorus 4K a b c 20 40 60 80 100 97.16 97.19 97.57 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SPECFEM3D Model: Layered Halfspace OpenBenchmarking.org Seconds, Fewer Is Better SPECFEM3D 4.0 Model: Layered Halfspace a b c 7 14 21 28 35 29.64 29.52 29.65 1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
ONNX Runtime Model: super-resolution-10 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: super-resolution-10 - Device: CPU - Executor: Standard a b 40 80 120 160 200 196.25 197.02 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
TensorFlow Device: CPU - Batch Size: 256 - Model: GoogLeNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 256 - Model: GoogLeNet a b c 50 100 150 200 250 214.30 213.49 213.69
Embree Binary: Pathtracer - Model: Asian Dragon OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.0.1 Binary: Pathtracer - Model: Asian Dragon a b c 20 40 60 80 100 84.35 84.28 84.59 MIN: 81.19 / MAX: 88.18 MIN: 81.06 / MAX: 88.82 MIN: 82.73 / MAX: 90.07
FFmpeg Encoder: libx265 - Scenario: Platform OpenBenchmarking.org Seconds, Fewer Is Better FFmpeg 6.0 Encoder: libx265 - Scenario: Platform a b c 100 200 300 400 500 448.63 448.26 449.86 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx265 - Scenario: Platform OpenBenchmarking.org FPS, More Is Better FFmpeg 6.0 Encoder: libx265 - Scenario: Platform a b c 4 8 12 16 20 16.88 16.90 16.84 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
OpenSSL Algorithm: AES-256-GCM OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.1 Algorithm: AES-256-GCM a b c 150000M 300000M 450000M 600000M 750000M 709965487310 712463205990 710407167180 1. (CC) gcc options: -pthread -m64 -O3 -ldl
TensorFlow Device: CPU - Batch Size: 64 - Model: GoogLeNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 64 - Model: GoogLeNet a b c 30 60 90 120 150 132.43 132.26 131.98
Timed FFmpeg Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed FFmpeg Compilation 6.0 Time To Compile a b c 4 8 12 16 20 17.81 17.87 17.84
Stress-NG Test: Memory Copying OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Memory Copying a b c 2K 4K 6K 8K 10K 10941.37 10922.24 10908.20 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
TensorFlow Device: CPU - Batch Size: 512 - Model: GoogLeNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 512 - Model: GoogLeNet a b c 60 120 180 240 300 257.44 258.17 258.13
RocksDB Test: Random Fill OpenBenchmarking.org Op/s, More Is Better RocksDB 8.0 Test: Random Fill a b c 20K 40K 60K 80K 100K 104159 104356 104086 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
Stress-NG Test: Glibc Qsort Data Sorting OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Glibc Qsort Data Sorting a b c 300 600 900 1200 1500 1550.53 1546.69 1547.20 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
OpenSSL Algorithm: RSA4096 OpenBenchmarking.org sign/s, More Is Better OpenSSL 3.1 Algorithm: RSA4096 a b c 8K 16K 24K 32K 40K 36965.0 36899.1 36977.5 1. (CC) gcc options: -pthread -m64 -O3 -ldl
OpenSSL Algorithm: RSA4096 OpenBenchmarking.org verify/s, More Is Better OpenSSL 3.1 Algorithm: RSA4096 a b c 300K 600K 900K 1200K 1500K 1189343.7 1186897.4 1188772.4 1. (CC) gcc options: -pthread -m64 -O3 -ldl
Stress-NG Test: IO_uring OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: IO_uring a b c 6K 12K 18K 24K 30K 26355.62 26364.63 26405.01 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
OpenSSL Algorithm: ChaCha20-Poly1305 OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.1 Algorithm: ChaCha20-Poly1305 a b c 60000M 120000M 180000M 240000M 300000M 293957452870 294463651570 294001151180 1. (CC) gcc options: -pthread -m64 -O3 -ldl
Stress-NG Test: Vector Math OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Vector Math a b c 70K 140K 210K 280K 350K 306730.85 306299.59 306225.09 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
RocksDB Test: Sequential Fill OpenBenchmarking.org Op/s, More Is Better RocksDB 8.0 Test: Sequential Fill a b c 20K 40K 60K 80K 100K 106061 106147 106227 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB Test: Update Random OpenBenchmarking.org Op/s, More Is Better RocksDB 8.0 Test: Update Random a b c 20K 40K 60K 80K 100K 97772 97861 97722 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
OpenSSL Algorithm: ChaCha20 OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.1 Algorithm: ChaCha20 a b c 90000M 180000M 270000M 360000M 450000M 425091630970 425550559910 425195437530 1. (CC) gcc options: -pthread -m64 -O3 -ldl
Stress-NG Test: Context Switching OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Context Switching a b c 800K 1600K 2400K 3200K 4000K 3511158.35 3508321.29 3511599.94 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG Test: x86_64 RdRand OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: x86_64 RdRand a b c 140K 280K 420K 560K 700K 658591.72 658461.44 658542.68 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
MariaDB Clients: 512 OpenBenchmarking.org Queries Per Second, More Is Better MariaDB 11.0.1 Clients: 512 a b c 20 40 60 80 100 110 110 110 1. (CXX) g++ options: -fPIC -pie -fstack-protector -O3 -shared -lrt -lpthread -lz -ldl -lm -lstdc++
AOM AV1 Encoder Mode: Speed 0 Two-Pass - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 3.6 Encoder Mode: Speed 0 Two-Pass - Input: Bosphorus 1080p a b c 0.1643 0.3286 0.4929 0.6572 0.8215 0.73 0.73 0.73 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
oneDNN Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU a b 160 320 480 640 800 SE +/- 13.20, N = 13 708.53 737.56 MIN: 616.19 MIN: 712.91 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU a b 9 18 27 36 45 SE +/- 1.54, N = 15 35.46 39.80 MIN: 13.46 MIN: 26.16 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU a b 0.5195 1.039 1.5585 2.078 2.5975 SE +/- 0.04986, N = 15 2.30887 2.21121 MIN: 1.77 MIN: 1.77 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU a b 4 8 12 16 20 SE +/- 10.58609, N = 12 18.01146 5.37587 MIN: 3.07 MIN: 3.86 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU a b 7 14 21 28 35 SE +/- 8.531398, N = 12 30.271432 0.668506 MIN: 0.45 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU a b 3 6 9 12 15 SE +/- 0.35622, N = 15 2.56877 10.05550 MIN: 1.1 MIN: 3.3 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU a b 0.459 0.918 1.377 1.836 2.295 SE +/- 0.20405, N = 15 2.03996 1.80895 MIN: 1.44 MIN: 1.63 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU a b 0.3732 0.7464 1.1196 1.4928 1.866 SE +/- 0.05971, N = 15 1.46571 1.65880 MIN: 1.06 MIN: 1.3 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
ONNX Runtime Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard a b 9 18 27 36 45 37.67 31.20 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel a b 10 20 30 40 50 44.86 45.56 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: super-resolution-10 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: super-resolution-10 - Device: CPU - Executor: Standard a b 1.1464 2.2928 3.4392 4.5856 5.732 5.09499 5.07500 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: super-resolution-10 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: super-resolution-10 - Device: CPU - Executor: Parallel a b 4 8 12 16 20 16.18 17.00 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard a b 1.0397 2.0794 3.1191 4.1588 5.1985 4.54812 4.62093 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel a b 1.2002 2.4004 3.6006 4.8008 6.001 5.25466 5.33429 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard a b c 7 14 21 28 35 27.62 29.69 26.98 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel a b c 7 14 21 28 35 30.60 30.60 31.31 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: fcn-resnet101-11 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: fcn-resnet101-11 - Device: CPU - Executor: Standard a b c 20 40 60 80 100 106.66 105.53 106.35 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel a b c 70 140 210 280 350 320.96 341.19 340.99 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard a b c 0.3211 0.6422 0.9633 1.2844 1.6055 1.42719 1.41212 1.41518 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel a b c 0.3701 0.7402 1.1103 1.4804 1.8505 1.59956 1.64503 1.59992 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: bertsquad-12 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: bertsquad-12 - Device: CPU - Executor: Standard a b c 14 28 42 56 70 62.01 61.29 62.65 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: bertsquad-12 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: bertsquad-12 - Device: CPU - Executor: Parallel a b c 20 40 60 80 100 75.44 76.75 74.46 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: yolov4 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: yolov4 - Device: CPU - Executor: Standard a b c 20 40 60 80 100 87.99 84.05 85.51 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: yolov4 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: yolov4 - Device: CPU - Executor: Parallel a b c 20 40 60 80 100 98.12 98.61 99.41 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: GPT-2 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: GPT-2 - Device: CPU - Executor: Standard a b c 1.2269 2.4538 3.6807 4.9076 6.1345 5.45281 5.34648 5.40610 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: GPT-2 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: GPT-2 - Device: CPU - Executor: Parallel a b c 3 6 9 12 15 10.81 10.89 10.71 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
Phoronix Test Suite v10.8.5