xeon eo march
Tests for a future article. 2 x Intel Xeon Platinum 8380 testing with an Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS) and ASPEED on Ubuntu 22.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2304013-NE-XEONEOMAR35&sor&grr .
xeon eo march: system details
Runs: a, b, c
Processor: 2 x Intel Xeon Platinum 8380 @ 3.40GHz (80 Cores / 160 Threads)
Motherboard: Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS)
Chipset: Intel Ice Lake IEH
Memory: 512GB
Disk: 7682GB INTEL SSDPF2KX076TZ
Graphics: ASPEED
Monitor: VE228
Network: 2 x Intel X710 for 10GBASE-T + 2 x Intel E810-C for QSFP
OS: Ubuntu 22.10
Kernel: 6.2.0-rc5-phx-dodt (x86_64)
Desktop: GNOME Shell 43.0
Display Server: X Server 1.21.1.3
Vulkan: 1.3.224
Compiler: GCC 12.2.0
File-System: ext4
Screen Resolution: 1920x1080
Kernel Details: Transparent Huge Pages: madvise
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: intel_pstate powersave (EPP: balance_performance); CPU Microcode: 0xd000375
Python Details: Python 3.10.7
Security Details: dodt: Mitigation of DOITM + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
xeon eo march rocksdb: Seq Fill mysqlslap: 512 mysqlslap: 1024 mysqlslap: 2048 mysqlslap: 4096 tensorflow: CPU - 512 - ResNet-50 onednn: Recurrent Neural Network Training - u8s8f32 - CPU onednn: Recurrent Neural Network Inference - f32 - CPU opencv: Graph API onednn: Recurrent Neural Network Training - f32 - CPU opencv: Stitching ffmpeg: libx265 - Video On Demand ffmpeg: libx265 - Video On Demand ffmpeg: libx265 - Platform ffmpeg: libx265 - Platform tensorflow: CPU - 256 - ResNet-50 ffmpeg: libx265 - Upload ffmpeg: libx265 - Upload ffmpeg: libx264 - Upload ffmpeg: libx264 - Upload opencv: Image Processing opencv: Features 2D build-llvm: Unix Makefiles blender: Barbershop - CPU-Only ffmpeg: libx264 - Video On Demand ffmpeg: libx264 - Video On Demand ffmpeg: libx264 - Platform ffmpeg: libx264 - Platform tensorflow: CPU - 512 - GoogLeNet vpxenc: Speed 0 - Bosphorus 4K build-nodejs: Time To Compile tensorflow: CPU - 64 - ResNet-50 vvenc: Bosphorus 4K - Fast ffmpeg: libx265 - Live ffmpeg: libx265 - Live build-llvm: Ninja openssl: ChaCha20-Poly1305 openssl: AES-256-GCM openssl: AES-128-GCM openssl: ChaCha20 openssl: SHA256 openssl: SHA512 onednn: Deconvolution Batch shapes_1d - f32 - CPU onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Inference - u8s8f32 - CPU onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU aom-av1: Speed 4 Two-Pass - Bosphorus 4K build-godot: Time To Compile tensorflow: CPU - 256 - GoogLeNet opencv: Video vvenc: Bosphorus 4K - Faster tensorflow: CPU - 32 - ResNet-50 onednn: IP Shapes 1D - f32 - CPU vpxenc: Speed 5 - Bosphorus 4K onednn: IP Shapes 1D - u8s8f32 - CPU tensorflow: CPU - 512 - AlexNet opencv: Object Detection opencv: DNN - Deep Neural Network vpxenc: Speed 0 - Bosphorus 1080p aom-av1: Speed 6 Two-Pass - Bosphorus 4K onednn: IP Shapes 1D - bf16bf16bf16 - CPU build2: Time To Compile onnx: GPT-2 - CPU - Parallel onnx: GPT-2 - CPU - Parallel onnx: GPT-2 - CPU - Standard onnx: GPT-2 
- CPU - Standard onnx: fcn-resnet101-11 - CPU - Parallel onnx: fcn-resnet101-11 - CPU - Parallel onnx: yolov4 - CPU - Parallel onnx: yolov4 - CPU - Parallel onnx: bertsquad-12 - CPU - Parallel onnx: bertsquad-12 - CPU - Parallel onnx: bertsquad-12 - CPU - Standard onnx: bertsquad-12 - CPU - Standard onnx: ArcFace ResNet-100 - CPU - Parallel onnx: ArcFace ResNet-100 - CPU - Parallel onnx: ArcFace ResNet-100 - CPU - Standard onnx: ArcFace ResNet-100 - CPU - Standard onnx: yolov4 - CPU - Standard onnx: yolov4 - CPU - Standard onnx: fcn-resnet101-11 - CPU - Standard onnx: fcn-resnet101-11 - CPU - Standard onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard nginx: 500 nginx: 200 onnx: CaffeNet 12-int8 - CPU - Parallel onnx: CaffeNet 12-int8 - CPU - Parallel apache: 500 onnx: CaffeNet 12-int8 - CPU - Standard onnx: CaffeNet 12-int8 - CPU - Standard apache: 200 onnx: ResNet50 v1-12-int8 - CPU - Parallel onnx: ResNet50 v1-12-int8 - CPU - Parallel onnx: ResNet50 v1-12-int8 - CPU - Standard onnx: ResNet50 v1-12-int8 - CPU - Standard onnx: super-resolution-10 - CPU - Parallel onnx: super-resolution-10 - CPU - Parallel onnx: super-resolution-10 - CPU - Standard onnx: super-resolution-10 - CPU - Standard opencv: Core aom-av1: Speed 4 Two-Pass - Bosphorus 1080p tensorflow: CPU - 16 - ResNet-50 daphne: OpenMP - Points2Image compress-zstd: 19, Long Mode - Decompression Speed compress-zstd: 19, Long Mode - Compression Speed vvenc: Bosphorus 1080p - Fast blender: Pabellon Barcelona - CPU-Only svt-av1: Preset 4 - Bosphorus 4K onednn: IP Shapes 3D - f32 - CPU onednn: IP Shapes 3D - bf16bf16bf16 - CPU tensorflow: CPU - 256 - AlexNet ffmpeg: libx264 - Live ffmpeg: libx264 - Live dav1d: Chimera 1080p 10-bit memcached: 1:100 dav1d: Chimera 1080p memcached: 1:10 memcached: 1:5 compress-zstd: 8, Long Mode - Decompression Speed compress-zstd: 8, 
Long Mode - Compression Speed compress-zstd: 3, Long Mode - Decompression Speed compress-zstd: 3, Long Mode - Compression Speed compress-zstd: 3 - Decompression Speed compress-zstd: 3 - Compression Speed compress-zstd: 8 - Decompression Speed compress-zstd: 8 - Compression Speed blender: Classroom - CPU-Only compress-zstd: 12 - Decompression Speed compress-zstd: 12 - Compression Speed compress-zstd: 19 - Decompression Speed compress-zstd: 19 - Compression Speed rocksdb: Rand Fill Sync aom-av1: Speed 0 Two-Pass - Bosphorus 4K rocksdb: Rand Fill onednn: IP Shapes 3D - u8s8f32 - CPU john-the-ripper: MD5 rocksdb: Update Rand rocksdb: Read Rand Write Rand john-the-ripper: HMAC-SHA512 rocksdb: Read While Writing rocksdb: Rand Read openssl: RSA4096 openssl: RSA4096 tensorflow: CPU - 64 - GoogLeNet dav1d: Summer Nature 4K vpxenc: Speed 5 - Bosphorus 1080p aom-av1: Speed 6 Two-Pass - Bosphorus 1080p vvenc: Bosphorus 1080p - Faster onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU aom-av1: Speed 8 Realtime - Bosphorus 4K aom-av1: Speed 6 Realtime - Bosphorus 4K tensorflow: CPU - 32 - GoogLeNet aom-av1: Speed 10 Realtime - Bosphorus 4K aom-av1: Speed 9 Realtime - Bosphorus 4K dav1d: Summer Nature 1080p specfem3d: Water-layered Halfspace tensorflow: CPU - 64 - AlexNet specfem3d: Layered Halfspace stress-ng: Atomic blender: Fishy Cat - CPU-Only stress-ng: Futex stress-ng: Pthread stress-ng: Zlib stress-ng: Socket Activity stress-ng: Memory Copying john-the-ripper: WPA PSK stress-ng: IO_uring john-the-ripper: bcrypt john-the-ripper: Blowfish stress-ng: CPU Cache stress-ng: Malloc stress-ng: System V Message Passing stress-ng: MEMFD stress-ng: NUMA stress-ng: Poll stress-ng: Glibc C String Functions stress-ng: Semaphores stress-ng: Vector Math stress-ng: Matrix Math stress-ng: CPU Stress stress-ng: Hash stress-ng: SENDFILE stress-ng: Function Call stress-ng: Mutex stress-ng: Forking stress-ng: Crypto stress-ng: MMAP 
stress-ng: x86_64 RdRand stress-ng: Glibc Qsort Data Sorting stress-ng: Context Switching tensorflow: CPU - 16 - GoogLeNet gromacs: MPI CPU - water_GMX50_bare aom-av1: Speed 0 Two-Pass - Bosphorus 1080p daphne: OpenMP - Euclidean Cluster daphne: OpenMP - NDT Mapping blender: BMW27 - CPU-Only svt-av1: Preset 8 - Bosphorus 4K tensorflow: CPU - 32 - AlexNet onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU aom-av1: Speed 6 Realtime - Bosphorus 1080p specfem3d: Homogeneous Halfspace aom-av1: Speed 8 Realtime - Bosphorus 1080p aom-av1: Speed 10 Realtime - Bosphorus 1080p embree: Pathtracer - Asian Dragon Obj aom-av1: Speed 9 Realtime - Bosphorus 1080p embree: Pathtracer ISPC - Asian Dragon Obj specfem3d: Tomographic Model onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU build-ffmpeg: Time To Compile specfem3d: Mount St. Helens tensorflow: CPU - 16 - AlexNet onednn: Convolution Batch Shapes Auto - f32 - CPU onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU onednn: Deconvolution Batch shapes_3d - f32 - CPU embree: Pathtracer - Crown svt-av1: Preset 12 - Bosphorus 4K svt-av1: Preset 13 - Bosphorus 4K embree: Pathtracer - Asian Dragon draco: Church Facade embree: Pathtracer ISPC - Crown draco: Lion embree: Pathtracer ISPC - Asian Dragon nginx: 100 a b c 106061 110 110 111 115 71.38 705.678 466.729 671086 708.530 508105 16.90 448.20691755 16.88 448.633638741 60.21 8.62 292.801657432 9.91 254.898947776 356989 270872 286.217 244.2 36.21 209.220285516 35.63 212.621669857 257.44 2.88 203.675 38.9 3.261 33.78 149.480929923 180.771 293957452870 709965487310 799202790060 425091630970 57611766850 22617583540 35.4599 711.380 453.911 455.595 5.01 148.771 214.3 121173 5.029 30.87 1.46571 5.2 2.56877 518.73 119494 115385 5.71 8.16 18.01146 93.004 10.8107 92.431 5.45281 183.295 320.962 3.1156 98.1183 10.1914 75.4412 13.2549 62.0109 16.1257 30.5988 32.6794 27.6247 36.1961 87.992 11.3641 106.662 9.37499 44.8631 
22.2889 37.6739 26.5412 222945.56 245504.32 1.59956 624.463 37442.6 1.42719 699.963 34680.57 5.25466 190.263 4.54812 219.832 16.1828 61.7891 5.09499 196.246 88425 8.82 23.33 9474.39 917.2 7.88 7.998 76.92 2.375 2.03996 2.30887 408.72 133.05 37.96 130.69 1970590.49 134.66 2137824.96 1774418.86 1161.6 295 1190.2 258 1128.4 2013.1 1128.8 624 63.07 1095.6 161.4 907.7 14.6 73440 0.33 104159 30.271432 10384000 97772 920426 151993000 8059987 273039094 1189343.7 36965 132.43 66.17 11.42 15.97 13.353 3.90427 0.373237 15.39 16.15 100.57 15.82 16.04 96.8 31.466335812 236.08 29.635032517 139.8 32.22 957096.2 87858.71 6374.22 94302.89 10941.37 483328 26355.62 114777 114499 1441584.05 190168476.52 10295375.81 766.9 437.26 9551392.98 60467587.4 11840389.35 306730.85 335705.12 152655.29 12949136.28 1143575.46 429335.83 36280764.76 65824.56 105894.93 3124.36 658591.72 1550.53 3511158.35 73.81 9.268 0.73 842.32 817.52 24.2 37.744 178.52 3.67056 28.78 18.02764317 29.39 30.04 76.1659 31.7 89.1801 15.298794168 0.201235 17.808 13.743206283 138.07 1.41955 1.17192 2.10120 0.875338 71.2659 116.885 97.161 84.3522 6765 87.9967 5502 103.136 106147 110 110 112 115 71.21 656.795 471.804 597049 737.563 489265 16.87 448.99 16.90 448.26 60.09 8.66 291.569506705 9.97 253.32 293481 356993 286.237 243.71 35.16 215.44 35.64 212.54 258.17 2.9 202.826 38.63 3.221 34.57 146.069290258 181.169 294463651570 712463205990 802646430240 425550559910 54810975320 22641175270 39.8033 752.425 437.395 447.363 4.93 148.85 213.49 133100 4.998 30.8 1.6588 5.04 10.0555 513.22 103986 107827 5.66 8.4 5.37587 94.969 10.8897 91.7637 5.34648 186.943 341.193 2.93086 98.6054 10.1411 76.7518 13.0285 61.2937 16.3145 30.5984 32.6796 29.689 33.6787 84.0506 11.8969 105.532 9.47539 45.5602 21.9478 31.2002 32.0474 226231.19 244509.87 1.64503 607.235 37076.95 1.41212 707.306 38092.88 5.33429 187.423 4.62093 216.364 16.9976 58.8271 5.075 197.018 83121 8.79 22.99 9912.90 928.7 7.25 8.014 77.29 2.336 1.80895 2.21121 407.51 130.09 38.82 
130.12 1946110.38 133.01 1736416.84 2133650.03 1163.4 278.4 1172.3 313.7 1132.3 2185 1140.4 603 62.71 1075.5 180.6 924 14.3 74771 0.34 104356 0.668506 10295000 97861 959340 152981000 7531197 268195000 1186897.4 36899.1 132.26 65.21 11.53 15.58 13.1 3.93436 0.37954 15.36 15.53 101.79 16.43 16.56 96.07 30.372995715 228.16 29.523601764 163.26 32.29 1018129.24 87532.4 5549.21 79477.34 10922.24 482323 26364.63 114825 107388 1691720.8 191206280.2 10283641.13 583.83 432.4 9566416.75 59352709.91 12234144.91 306299.59 341137.35 152455.93 13204487.51 1162795.27 413303.53 36505304.6 66467.95 108459.26 3115.23 658461.44 1546.69 3508321.29 72.81 9.199 0.73 796.63 771.26 24.36 38.178 174.81 4.16467 27.93 18.182305176 29.81 30.21 76.6639 31.1 89.2864 14.714132502 0.21833 17.865 13.461312471 145.79 1.40563 1.16101 2.16346 0.925478 71.8254 119.691 97.19 84.2763 6702 88.571 5479 104.4793 106227 110 111 111 114 71.74 16.79 451.26 16.84 449.86 60.71 8.71 289.92 9.81 257.31 288.948 244.89 35.33 214.38 36.21 209.19 258.13 2.88 207.61 38.71 3.191 34.24 147.51 180.059 294001151180 710407167180 806241823680 425195437530 57348072400 22228899690 4.95 146.51 213.69 5.012 31 5.02 514.4 5.73 7.99 92.475 10.7076 93.3205 5.4061 184.883 340.994 2.93257 99.4055 10.0594 74.4648 13.4286 62.6519 15.9607 31.3122 31.9347 26.9844 37.0538 85.5088 11.6941 106.353 9.4022 225697.84 246434.82 1.59992 624.35 1.41518 705.795 8.86 23.23 8791.23 920.9 6.95 8.119 77.15 2.327 410.27 130.88 38.58 130.27 1527158.77 134.1 2222180.6 1820534.68 1129.3 285.6 1189.7 267.4 1126.3 2209.8 831.7 638.2 63.14 1072.1 169.9 907 14.4 74573 0.34 104086 2766000 97722 954852 19292000 8192793 273197200 1188772.4 36977.5 131.98 65.7 11.31 16.07 12.977 15.51 16.33 101.18 16.33 16.63 95.86 30.958272243 227.64 29.647600618 349.89 32.07 1056011.64 90272.05 6511.98 50201.57 10908.2 255829 26405.01 64015 51517 1657780.82 190376822.58 73426960.15 583.2 433.52 9500903.35 58278938.78 12232435.5 306225.09 341249.75 157803.19 13066446.38 
1165355.2 431183.83 36276764.04 65696.65 107612.44 3092.95 658542.68 1547.2 3511599.94 72.7 9.259 0.73 848.69 892.13 24.42 37.328 175.31 28.61 18.287061065 29.56 30.98 76.4083 30.01 88.8879 14.958818098 17.837 13.567236154 145.45 71.888 120.064 97.574 84.5925 6767 87.85 5521 104.2777 OpenBenchmarking.org
RocksDB 8.0 - Test: Sequential Fill. Op/s, more is better. c: 106227; b: 106147; a: 106061. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
MariaDB 11.0.1 - Clients: 512. Queries Per Second, more is better. c: 110; b: 110; a: 110. (CXX) g++ options: -fPIC -pie -fstack-protector -O3 -shared -lrt -lpthread -lz -ldl -lm -lstdc++
MariaDB 11.0.1 - Clients: 1024. Queries Per Second, more is better. c: 111; b: 110; a: 110. (CXX) g++ options: -fPIC -pie -fstack-protector -O3 -shared -lrt -lpthread -lz -ldl -lm -lstdc++
MariaDB 11.0.1 - Clients: 2048. Queries Per Second, more is better. b: 112; c: 111; a: 111. (CXX) g++ options: -fPIC -pie -fstack-protector -O3 -shared -lrt -lpthread -lz -ldl -lm -lstdc++
MariaDB 11.0.1 - Clients: 4096. Queries Per Second, more is better. b: 115; a: 115; c: 114. (CXX) g++ options: -fPIC -pie -fstack-protector -O3 -shared -lrt -lpthread -lz -ldl -lm -lstdc++
TensorFlow 2.12 - Device: CPU - Batch Size: 512 - Model: ResNet-50. images/sec, more is better. c: 71.74; a: 71.38; b: 71.21.
oneDNN 3.1 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU. ms, fewer is better. b: 656.80 (MIN: 632.15); a: 705.68 (MIN: 626). SE +/- 9.93, N = 15. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.1 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU. ms, fewer is better. a: 466.73 (MIN: 406.85); b: 471.80 (MIN: 461.64). SE +/- 6.11, N = 15. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenCV 4.7 - Test: Graph API. ms, fewer is better. b: 597049; a: 671086. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
oneDNN 3.1 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU. ms, fewer is better. a: 708.53 (MIN: 616.19); b: 737.56 (MIN: 712.91). SE +/- 13.20, N = 13. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenCV 4.7 - Test: Stitching. ms, fewer is better. b: 489265; a: 508105. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
FFmpeg 6.0 - Encoder: libx265 - Scenario: Video On Demand. FPS, more is better. a: 16.90; b: 16.87; c: 16.79. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg 6.0 - Encoder: libx265 - Scenario: Video On Demand. Seconds, fewer is better. a: 448.21; b: 448.99; c: 451.26. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
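The FFmpeg results are reported twice, once as FPS and once as elapsed seconds. As a rough consistency check, the product of the two should be about the same frame count for every run. The sketch below uses the libx265 Video On Demand figures listed above; the variable names are illustrative, and the frame count is only implied by the arithmetic, not stated in the result file.

```python
# FPS and elapsed seconds as reported for FFmpeg libx265, Video On Demand.
runs = {
    "a": (16.90, 448.21),
    "b": (16.87, 448.99),
    "c": (16.79, 451.26),
}

# FPS * seconds gives the number of frames each run appears to have encoded.
# This is an inference from the two reported figures, not a reported value.
implied_frames = {name: fps * secs for name, (fps, secs) in runs.items()}

# Relative spread between the largest and smallest implied frame count.
lowest = min(implied_frames.values())
highest = max(implied_frames.values())
spread = (highest - lowest) / lowest

for name, frames in implied_frames.items():
    print(f"run {name}: about {frames:.0f} frames")
print(f"relative spread: {spread:.4%}")
```

If the spread is small, the two charts are describing the same measurements and either one can be used for comparison.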
FFmpeg 6.0 - Encoder: libx265 - Scenario: Platform. FPS, more is better. b: 16.90; a: 16.88; c: 16.84. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg 6.0 - Encoder: libx265 - Scenario: Platform. Seconds, fewer is better. b: 448.26; a: 448.63; c: 449.86. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
TensorFlow 2.12 - Device: CPU - Batch Size: 256 - Model: ResNet-50. images/sec, more is better. c: 60.71; a: 60.21; b: 60.09.
FFmpeg 6.0 - Encoder: libx265 - Scenario: Upload. FPS, more is better. c: 8.71; b: 8.66; a: 8.62. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg 6.0 - Encoder: libx265 - Scenario: Upload. Seconds, fewer is better. c: 289.92; b: 291.57; a: 292.80. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg 6.0 - Encoder: libx264 - Scenario: Upload. FPS, more is better. b: 9.97; a: 9.91; c: 9.81. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg 6.0 - Encoder: libx264 - Scenario: Upload. Seconds, fewer is better. b: 253.32; a: 254.90; c: 257.31. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
OpenCV 4.7 - Test: Image Processing. ms, fewer is better. b: 293481; a: 356989. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
OpenCV 4.7 - Test: Features 2D. ms, fewer is better. a: 270872; b: 356993. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
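Most results here differ by well under one percent between runs, but the OpenCV Image Processing and Features 2D figures do not. A minimal sketch for quantifying that gap on a "fewer is better" metric follows; `slowdown_percent` is a hypothetical helper name, and the inputs are the millisecond values listed above.

```python
def slowdown_percent(best_ms: float, worst_ms: float) -> float:
    """How much slower the worst run is than the best run, in percent.

    Intended for "fewer is better" metrics such as milliseconds.
    """
    return (worst_ms / best_ms - 1.0) * 100.0


# Values as reported for OpenCV 4.7 (ms, fewer is better).
image_processing = {"b": 293481, "a": 356989}
features_2d = {"a": 270872, "b": 356993}

image_gap = slowdown_percent(min(image_processing.values()),
                             max(image_processing.values()))
features_gap = slowdown_percent(min(features_2d.values()),
                                max(features_2d.values()))

print(f"Image Processing: slowest run is {image_gap:.1f}% slower")
print(f"Features 2D: slowest run is {features_gap:.1f}% slower")
```

Gaps of this size between runs on the same hardware usually say more about run-to-run variance of the test than about the system, so they are worth flagging before drawing conclusions.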
Timed LLVM Compilation 16.0 - Build System: Unix Makefiles. Seconds, fewer is better. a: 286.22; b: 286.24; c: 288.95.
Blender 3.5 - Blend File: Barbershop - Compute: CPU-Only. Seconds, fewer is better. b: 243.71; a: 244.20; c: 244.89.
FFmpeg 6.0 - Encoder: libx264 - Scenario: Video On Demand. FPS, more is better. a: 36.21; c: 35.33; b: 35.16. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg 6.0 - Encoder: libx264 - Scenario: Video On Demand. Seconds, fewer is better. a: 209.22; c: 214.38; b: 215.44. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg 6.0 - Encoder: libx264 - Scenario: Platform. FPS, more is better. c: 36.21; b: 35.64; a: 35.63. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg 6.0 - Encoder: libx264 - Scenario: Platform. Seconds, fewer is better. c: 209.19; b: 212.54; a: 212.62. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
TensorFlow 2.12 - Device: CPU - Batch Size: 512 - Model: GoogLeNet. images/sec, more is better. b: 258.17; c: 258.13; a: 257.44.
VP9 libvpx Encoding 1.13 - Speed: Speed 0 - Input: Bosphorus 4K. Frames Per Second, more is better. b: 2.90; c: 2.88; a: 2.88. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=gnu++11
Timed Node.js Compilation 19.8.1 - Time To Compile. Seconds, fewer is better. b: 202.83; a: 203.68; c: 207.61.
TensorFlow 2.12 - Device: CPU - Batch Size: 64 - Model: ResNet-50. images/sec, more is better. a: 38.90; c: 38.71; b: 38.63.
VVenC 1.7 - Video Input: Bosphorus 4K - Video Preset: Fast. Frames Per Second, more is better. a: 3.261; b: 3.221; c: 3.191. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
FFmpeg 6.0 - Encoder: libx265 - Scenario: Live. FPS, more is better. b: 34.57; c: 34.24; a: 33.78. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg 6.0 - Encoder: libx265 - Scenario: Live. Seconds, fewer is better. b: 146.07; c: 147.51; a: 149.48. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Timed LLVM Compilation 16.0 - Build System: Ninja. Seconds, fewer is better. c: 180.06; a: 180.77; b: 181.17.
OpenSSL 3.1 - Algorithm: ChaCha20-Poly1305. byte/s, more is better. b: 294463651570; c: 294001151180; a: 293957452870. (CC) gcc options: -pthread -m64 -O3 -ldl
OpenSSL 3.1 - Algorithm: AES-256-GCM. byte/s, more is better. b: 712463205990; c: 710407167180; a: 709965487310. (CC) gcc options: -pthread -m64 -O3 -ldl
OpenSSL 3.1 - Algorithm: AES-128-GCM. byte/s, more is better. c: 806241823680; b: 802646430240; a: 799202790060. (CC) gcc options: -pthread -m64 -O3 -ldl
OpenSSL 3.1 - Algorithm: ChaCha20. byte/s, more is better. b: 425550559910; c: 425195437530; a: 425091630970. (CC) gcc options: -pthread -m64 -O3 -ldl
OpenSSL 3.1 - Algorithm: SHA256. byte/s, more is better. a: 57611766850; c: 57348072400; b: 54810975320. (CC) gcc options: -pthread -m64 -O3 -ldl
OpenSSL 3.1 - Algorithm: SHA512. byte/s, more is better. b: 22641175270; a: 22617583540; c: 22228899690. (CC) gcc options: -pthread -m64 -O3 -ldl
oneDNN 3.1 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU. ms, fewer is better. a: 35.46 (MIN: 13.46); b: 39.80 (MIN: 26.16). SE +/- 1.54, N = 15. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.1 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU. ms, fewer is better. a: 711.38 (MIN: 680.93); b: 752.43 (MIN: 723.13). SE +/- 2.91, N = 3. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.1 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU. ms, fewer is better. b: 437.40 (MIN: 425.41); a: 453.91 (MIN: 437.57). SE +/- 2.89, N = 3. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.1 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU. ms, fewer is better. b: 447.36 (MIN: 435.41); a: 455.60 (MIN: 430.98). SE +/- 6.28, N = 3. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
AOM AV1 3.6 - Encoder Mode: Speed 4 Two-Pass - Input: Bosphorus 4K. Frames Per Second, more is better. a: 5.01; c: 4.95; b: 4.93. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
Timed Godot Game Engine Compilation 4.0 - Time To Compile. Seconds, fewer is better. c: 146.51; a: 148.77; b: 148.85.
TensorFlow 2.12 - Device: CPU - Batch Size: 256 - Model: GoogLeNet. images/sec, more is better. a: 214.30; c: 213.69; b: 213.49.
OpenCV 4.7 - Test: Video. ms, fewer is better. a: 121173; b: 133100. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
VVenC 1.7 - Video Input: Bosphorus 4K - Video Preset: Faster. Frames Per Second, more is better. a: 5.029; c: 5.012; b: 4.998. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
TensorFlow 2.12 - Device: CPU - Batch Size: 32 - Model: ResNet-50. images/sec, more is better. c: 31.00; a: 30.87; b: 30.80.
oneDNN 3.1 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU. ms, fewer is better. a: 1.46571 (MIN: 1.06); b: 1.65880 (MIN: 1.3). SE +/- 0.05971, N = 15. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
VP9 libvpx Encoding 1.13 - Speed: Speed 5 - Input: Bosphorus 4K. Frames Per Second, more is better. a: 5.20; b: 5.04; c: 5.02. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=gnu++11
oneDNN 3.1 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU. ms, fewer is better. a: 2.56877 (MIN: 1.1); b: 10.05550 (MIN: 3.3). SE +/- 0.35622, N = 15. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
TensorFlow 2.12 - Device: CPU - Batch Size: 512 - Model: AlexNet. images/sec, more is better. a: 518.73; c: 514.40; b: 513.22.
OpenCV 4.7 - Test: Object Detection. ms, fewer is better. b: 103986; a: 119494. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
OpenCV 4.7 - Test: DNN - Deep Neural Network. ms, fewer is better. b: 107827; a: 115385. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
VP9 libvpx Encoding 1.13 - Speed: Speed 0 - Input: Bosphorus 1080p. Frames Per Second, more is better. c: 5.73; a: 5.71; b: 5.66. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=gnu++11
AOM AV1 3.6 - Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 4K. Frames Per Second, more is better. b: 8.40; a: 8.16; c: 7.99. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
oneDNN 3.1 - Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU. ms, fewer is better. b: 5.37587 (MIN: 3.86); a: 18.01146 (MIN: 3.07). SE +/- 10.58609, N = 12. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
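Several oneDNN results carry an "SE +/- x, N = n" annotation. Assuming SE here is the usual standard error of the mean (sample standard deviation divided by the square root of N), the spread of the individual samples can be recovered from it. The sketch below applies that to the IP Shapes 1D bf16bf16bf16 annotation; `implied_std_dev` is a hypothetical helper, and the result file does not say which run the annotation belongs to.

```python
import math


def implied_std_dev(standard_error: float, n: int) -> float:
    """Sample standard deviation implied by a standard error of the mean.

    Assumes SE = s / sqrt(N), which is the conventional definition.
    """
    return standard_error * math.sqrt(n)


# Annotation as reported: SE +/- 10.58609, N = 12.
std_dev = implied_std_dev(10.58609, 12)
print(f"implied per-sample standard deviation: {std_dev:.2f} ms")
```

An implied deviation larger than either reported mean suggests that a few samples were far from the rest, so the averages for this test should be read with caution.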
Build2 0.15 - Time To Compile. Seconds, fewer is better. c: 92.48; a: 93.00; b: 94.97.
ONNX Runtime 1.14 - Model: GPT-2 - Device: CPU - Executor: Parallel. Inference Time Cost (ms), fewer is better. c: 10.71; a: 10.81; b: 10.89. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14 - Model: GPT-2 - Device: CPU - Executor: Parallel. Inferences Per Second, more is better. c: 93.32; a: 92.43; b: 91.76. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
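Each ONNX Runtime test is listed twice, as inference time cost in milliseconds and as inferences per second. The two should be roughly reciprocal (1000 / ms). The sketch below checks that for the GPT-2 Parallel figures above; the small mismatch it reports is presumably due to rounding or to the two figures being averaged separately, which is an assumption and not something the result file states.

```python
# GPT-2, CPU, Parallel executor: (latency in ms, reported inferences/sec).
gpt2_parallel = {
    "c": (10.71, 93.32),
    "a": (10.81, 92.43),
    "b": (10.89, 91.76),
}

relative_gaps = {}
for run, (latency_ms, reported_ips) in gpt2_parallel.items():
    # Throughput implied by the latency figure alone.
    derived_ips = 1000.0 / latency_ms
    relative_gaps[run] = abs(derived_ips - reported_ips) / reported_ips
    print(f"run {run}: derived {derived_ips:.2f}/s, reported {reported_ips:.2f}/s")

worst_gap = max(relative_gaps.values())
print(f"largest relative mismatch: {worst_gap:.3%}")
```

Because the two views agree this closely, ranking the runs by either metric gives the same order.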
ONNX Runtime 1.14, Model: GPT-2 - Device: CPU - Executor: Standard. Inference Time Cost (ms), Fewer Is Better. b: 5.34648, c: 5.40610, a: 5.45281. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: GPT-2 - Device: CPU - Executor: Standard. Inferences Per Second, More Is Better. b: 186.94, c: 184.88, a: 183.30. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel. Inference Time Cost (ms), Fewer Is Better. a: 320.96, c: 340.99, b: 341.19. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel. Inferences Per Second, More Is Better. a: 3.11560, c: 2.93257, b: 2.93086. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: yolov4 - Device: CPU - Executor: Parallel. Inference Time Cost (ms), Fewer Is Better. a: 98.12, b: 98.61, c: 99.41. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: yolov4 - Device: CPU - Executor: Parallel. Inferences Per Second, More Is Better. a: 10.19, b: 10.14, c: 10.06. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: bertsquad-12 - Device: CPU - Executor: Parallel. Inference Time Cost (ms), Fewer Is Better. c: 74.46, a: 75.44, b: 76.75. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: bertsquad-12 - Device: CPU - Executor: Parallel. Inferences Per Second, More Is Better. c: 13.43, a: 13.25, b: 13.03. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: bertsquad-12 - Device: CPU - Executor: Standard. Inference Time Cost (ms), Fewer Is Better. b: 61.29, a: 62.01, c: 62.65. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: bertsquad-12 - Device: CPU - Executor: Standard. Inferences Per Second, More Is Better. b: 16.31, a: 16.13, c: 15.96. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel. Inference Time Cost (ms), Fewer Is Better. b: 30.60, a: 30.60, c: 31.31. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel. Inferences Per Second, More Is Better. b: 32.68, a: 32.68, c: 31.93. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard. Inference Time Cost (ms), Fewer Is Better. c: 26.98, a: 27.62, b: 29.69. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard. Inferences Per Second, More Is Better. c: 37.05, a: 36.20, b: 33.68. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: yolov4 - Device: CPU - Executor: Standard. Inference Time Cost (ms), Fewer Is Better. b: 84.05, c: 85.51, a: 87.99. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: yolov4 - Device: CPU - Executor: Standard. Inferences Per Second, More Is Better. b: 11.90, c: 11.69, a: 11.36. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: fcn-resnet101-11 - Device: CPU - Executor: Standard. Inference Time Cost (ms), Fewer Is Better. b: 105.53, c: 106.35, a: 106.66. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: fcn-resnet101-11 - Device: CPU - Executor: Standard. Inferences Per Second, More Is Better. b: 9.47539, c: 9.40220, a: 9.37499. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel. Inference Time Cost (ms), Fewer Is Better. a: 44.86, b: 45.56. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel. Inferences Per Second, More Is Better. a: 22.29, b: 21.95. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard. Inference Time Cost (ms), Fewer Is Better. b: 31.20, a: 37.67. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard. Inferences Per Second, More Is Better. b: 32.05, a: 26.54. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
nginx 1.23.2, Connections: 500. Requests Per Second, More Is Better. b: 226231.19, c: 225697.84, a: 222945.56. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
nginx 1.23.2, Connections: 200. Requests Per Second, More Is Better. c: 246434.82, a: 245504.32, b: 244509.87. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
ONNX Runtime 1.14, Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel. Inference Time Cost (ms), Fewer Is Better. a: 1.59956, c: 1.59992, b: 1.64503. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel. Inferences Per Second, More Is Better. a: 624.46, c: 624.35, b: 607.24. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
Apache HTTP Server 2.4.56, Concurrent Requests: 500. Requests Per Second, More Is Better. a: 37442.60, b: 37076.95. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
ONNX Runtime 1.14, Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard. Inference Time Cost (ms), Fewer Is Better. b: 1.41212, c: 1.41518, a: 1.42719. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard. Inferences Per Second, More Is Better. b: 707.31, c: 705.80, a: 699.96. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
Apache HTTP Server 2.4.56, Concurrent Requests: 200. Requests Per Second, More Is Better. b: 38092.88, a: 34680.57. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
ONNX Runtime 1.14, Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel. Inference Time Cost (ms), Fewer Is Better. a: 5.25466, b: 5.33429. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel. Inferences Per Second, More Is Better. a: 190.26, b: 187.42. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard. Inference Time Cost (ms), Fewer Is Better. a: 4.54812, b: 4.62093. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard. Inferences Per Second, More Is Better. a: 219.83, b: 216.36. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: super-resolution-10 - Device: CPU - Executor: Parallel. Inference Time Cost (ms), Fewer Is Better. a: 16.18, b: 17.00. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: super-resolution-10 - Device: CPU - Executor: Parallel. Inferences Per Second, More Is Better. a: 61.79, b: 58.83. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: super-resolution-10 - Device: CPU - Executor: Standard. Inference Time Cost (ms), Fewer Is Better. b: 5.07500, a: 5.09499. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.14, Model: super-resolution-10 - Device: CPU - Executor: Standard. Inferences Per Second, More Is Better. b: 197.02, a: 196.25. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
OpenCV 4.7, Test: Core. ms, Fewer Is Better. b: 83121, a: 88425. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
AOM AV1 3.6, Encoder Mode: Speed 4 Two-Pass - Input: Bosphorus 1080p. Frames Per Second, More Is Better. c: 8.86, a: 8.82, b: 8.79. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
TensorFlow 2.12, Device: CPU - Batch Size: 16 - Model: ResNet-50. images/sec, More Is Better. a: 23.33, c: 23.23, b: 22.99.
Darmstadt Automotive Parallel Heterogeneous Suite 2021.11.02, Backend: OpenMP - Kernel: Points2Image. Test Cases Per Minute, More Is Better. b: 9912.90, a: 9474.39, c: 8791.23. (CXX) g++ options: -O3 -std=c++11 -fopenmp
Zstd Compression 1.5.4, Compression Level: 19, Long Mode - Decompression Speed. MB/s, More Is Better. b: 928.7, c: 920.9, a: 917.2. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4, Compression Level: 19, Long Mode - Compression Speed. MB/s, More Is Better. a: 7.88, b: 7.25, c: 6.95. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
VVenC 1.7, Video Input: Bosphorus 1080p - Video Preset: Fast. Frames Per Second, More Is Better. c: 8.119, b: 8.014, a: 7.998. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
Blender 3.5, Blend File: Pabellon Barcelona - Compute: CPU-Only. Seconds, Fewer Is Better. a: 76.92, c: 77.15, b: 77.29.
SVT-AV1 1.4, Encoder Mode: Preset 4 - Input: Bosphorus 4K. Frames Per Second, More Is Better. a: 2.375, b: 2.336, c: 2.327. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
oneDNN 3.1, Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU. ms, Fewer Is Better. b: 1.80895, a: 2.03996. SE +/- 0.20405, N = 15. MIN: 1.63, MIN: 1.44. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.1, Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU. ms, Fewer Is Better. b: 2.21121, a: 2.30887. SE +/- 0.04986, N = 15. MIN: 1.77, MIN: 1.77. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
TensorFlow 2.12, Device: CPU - Batch Size: 256 - Model: AlexNet. images/sec, More Is Better. c: 410.27, a: 408.72, b: 407.51.
FFmpeg 6.0, Encoder: libx264 - Scenario: Live. FPS, More Is Better. a: 133.05, c: 130.88, b: 130.09. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg 6.0, Encoder: libx264 - Scenario: Live. Seconds, Fewer Is Better. a: 37.96, c: 38.58, b: 38.82. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
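The FFmpeg libx264 Live scenario is reported both as FPS and as elapsed seconds. A quick sanity check, sketched below, multiplies the two to get the frame count each configuration implies. It assumes that both charts describe the same encode, which the export does not state explicitly; the helper name is hypothetical.

```python
def implied_frames(fps: float, seconds: float) -> float:
    """Frames processed if `fps` was sustained for `seconds`."""
    return fps * seconds


# (FPS, seconds) copied from the FFmpeg libx264 / Live results above.
runs = {"a": (133.05, 37.96), "c": (130.88, 38.58), "b": (130.09, 38.82)}

frames = {name: implied_frames(fps, sec) for name, (fps, sec) in runs.items()}

# Relative spread between the largest and smallest implied frame count.
spread = (max(frames.values()) - min(frames.values())) / min(frames.values())

for name, count in frames.items():
    print(f"{name}: about {count:.0f} frames")
print(f"relative spread: {spread:.4%}")
```

All three configurations imply roughly the same number of frames, so the FPS and Seconds charts can be read as two views of one measurement.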
dav1d 1.1, Video Input: Chimera 1080p 10-bit. FPS, More Is Better. a: 130.69, c: 130.27, b: 130.12. (CC) gcc options: -pthread
Memcached 1.6.19, Set To Get Ratio: 1:100. Ops/sec, More Is Better. a: 1970590.49, b: 1946110.38, c: 1527158.77. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
dav1d 1.1, Video Input: Chimera 1080p. FPS, More Is Better. a: 134.66, c: 134.10, b: 133.01. (CC) gcc options: -pthread
Memcached 1.6.19, Set To Get Ratio: 1:10. Ops/sec, More Is Better. c: 2222180.60, a: 2137824.96, b: 1736416.84. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Memcached 1.6.19, Set To Get Ratio: 1:5. Ops/sec, More Is Better. b: 2133650.03, c: 1820534.68, a: 1774418.86. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Zstd Compression 1.5.4, Compression Level: 8, Long Mode - Decompression Speed. MB/s, More Is Better. b: 1163.4, a: 1161.6, c: 1129.3. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4, Compression Level: 8, Long Mode - Compression Speed. MB/s, More Is Better. a: 295.0, c: 285.6, b: 278.4. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4, Compression Level: 3, Long Mode - Decompression Speed. MB/s, More Is Better. a: 1190.2, c: 1189.7, b: 1172.3. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4, Compression Level: 3, Long Mode - Compression Speed. MB/s, More Is Better. b: 313.7, c: 267.4, a: 258.0. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4, Compression Level: 3 - Decompression Speed. MB/s, More Is Better. b: 1132.3, a: 1128.4, c: 1126.3. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4, Compression Level: 3 - Compression Speed. MB/s, More Is Better. c: 2209.8, b: 2185.0, a: 2013.1. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4, Compression Level: 8 - Decompression Speed. MB/s, More Is Better. b: 1140.4, a: 1128.8, c: 831.7. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4, Compression Level: 8 - Compression Speed. MB/s, More Is Better. c: 638.2, a: 624.0, b: 603.0. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Blender 3.5, Blend File: Classroom - Compute: CPU-Only. Seconds, Fewer Is Better. b: 62.71, a: 63.07, c: 63.14.
Zstd Compression 1.5.4, Compression Level: 12 - Decompression Speed. MB/s, More Is Better. a: 1095.6, b: 1075.5, c: 1072.1. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4, Compression Level: 12 - Compression Speed. MB/s, More Is Better. b: 180.6, c: 169.9, a: 161.4. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4, Compression Level: 19 - Decompression Speed. MB/s, More Is Better. b: 924.0, a: 907.7, c: 907.0. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4, Compression Level: 19 - Compression Speed. MB/s, More Is Better. a: 14.6, c: 14.4, b: 14.3. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
RocksDB 8.0, Test: Random Fill Sync. Op/s, More Is Better. b: 74771, c: 74573, a: 73440. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
AOM AV1 3.6, Encoder Mode: Speed 0 Two-Pass - Input: Bosphorus 4K. Frames Per Second, More Is Better. c: 0.34, b: 0.34, a: 0.33. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
RocksDB 8.0, Test: Random Fill. Op/s, More Is Better. b: 104356, a: 104159, c: 104086. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
oneDNN 3.1, Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU. ms, Fewer Is Better. b: 0.668506, a: 30.271432. SE +/- 8.531398, N = 12. MIN: 0.45. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
John The Ripper 2023.03.14, Test: MD5. Real C/S, More Is Better. a: 10384000, b: 10295000, c: 2766000. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2
RocksDB 8.0, Test: Update Random. Op/s, More Is Better. b: 97861, a: 97772, c: 97722. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB 8.0, Test: Read Random Write Random. Op/s, More Is Better. b: 959340, c: 954852, a: 920426. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
John The Ripper 2023.03.14, Test: HMAC-SHA512. Real C/S, More Is Better. b: 152981000, a: 151993000, c: 19292000. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2
RocksDB 8.0, Test: Read While Writing. Op/s, More Is Better. c: 8192793, a: 8059987, b: 7531197. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB 8.0, Test: Random Read. Op/s, More Is Better. c: 273197200, a: 273039094, b: 268195000. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
OpenSSL 3.1, Algorithm: RSA4096. verify/s, More Is Better. a: 1189343.7, c: 1188772.4, b: 1186897.4. (CC) gcc options: -pthread -m64 -O3 -ldl
OpenSSL 3.1, Algorithm: RSA4096. sign/s, More Is Better. c: 36977.5, a: 36965.0, b: 36899.1. (CC) gcc options: -pthread -m64 -O3 -ldl
TensorFlow 2.12, Device: CPU - Batch Size: 64 - Model: GoogLeNet. images/sec, More Is Better. a: 132.43, b: 132.26, c: 131.98.
dav1d 1.1, Video Input: Summer Nature 4K. FPS, More Is Better. a: 66.17, c: 65.70, b: 65.21. (CC) gcc options: -pthread
VP9 libvpx Encoding 1.13, Speed: Speed 5 - Input: Bosphorus 1080p. Frames Per Second, More Is Better. b: 11.53, a: 11.42, c: 11.31. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=gnu++11
AOM AV1 3.6, Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 1080p. Frames Per Second, More Is Better. c: 16.07, a: 15.97, b: 15.58. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
VVenC 1.7, Video Input: Bosphorus 1080p - Video Preset: Faster. Frames Per Second, More Is Better. a: 13.35, b: 13.10, c: 12.98. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
oneDNN 3.1, Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU. ms, Fewer Is Better. a: 3.90427, b: 3.93436. SE +/- 0.00278, N = 3. MIN: 3.68, MIN: 3.68. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.1, Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU. ms, Fewer Is Better. a: 0.373237, b: 0.379540. SE +/- 0.003364, N = 3. MIN: 0.33, MIN: 0.33. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
AOM AV1 Encoder Mode: Speed 8 Realtime - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 3.6 Encoder Mode: Speed 8 Realtime - Input: Bosphorus 4K c a b 4 8 12 16 20 15.51 15.39 15.36 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
AOM AV1 Encoder Mode: Speed 6 Realtime - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 3.6 Encoder Mode: Speed 6 Realtime - Input: Bosphorus 4K c a b 4 8 12 16 20 16.33 16.15 15.53 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
TensorFlow Device: CPU - Batch Size: 32 - Model: GoogLeNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 32 - Model: GoogLeNet b c a 20 40 60 80 100 101.79 101.18 100.57
AOM AV1 Encoder Mode: Speed 10 Realtime - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 3.6 Encoder Mode: Speed 10 Realtime - Input: Bosphorus 4K b c a 4 8 12 16 20 16.43 16.33 15.82 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
AOM AV1 Encoder Mode: Speed 9 Realtime - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 3.6 Encoder Mode: Speed 9 Realtime - Input: Bosphorus 4K c b a 4 8 12 16 20 16.63 16.56 16.04 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
dav1d Video Input: Summer Nature 1080p OpenBenchmarking.org FPS, More Is Better dav1d 1.1 Video Input: Summer Nature 1080p a b c 20 40 60 80 100 96.80 96.07 95.86 1. (CC) gcc options: -pthread
SPECFEM3D Model: Water-layered Halfspace OpenBenchmarking.org Seconds, Fewer Is Better SPECFEM3D 4.0 Model: Water-layered Halfspace b c a 7 14 21 28 35 30.37 30.96 31.47 1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
TensorFlow Device: CPU - Batch Size: 64 - Model: AlexNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 64 - Model: AlexNet a b c 50 100 150 200 250 236.08 228.16 227.64
SPECFEM3D Model: Layered Halfspace OpenBenchmarking.org Seconds, Fewer Is Better SPECFEM3D 4.0 Model: Layered Halfspace b a c 7 14 21 28 35 29.52 29.64 29.65 1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Stress-NG Test: Atomic OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Atomic c b a 80 160 240 320 400 349.89 163.26 139.80 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Blender Blend File: Fishy Cat - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 3.5 Blend File: Fishy Cat - Compute: CPU-Only c a b 7 14 21 28 35 32.07 32.22 32.29
Stress-NG Test: Futex OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Futex c b a 200K 400K 600K 800K 1000K 1056011.64 1018129.24 957096.20 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG Test: Pthread OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Pthread c a b 20K 40K 60K 80K 100K 90272.05 87858.71 87532.40 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG Test: Zlib OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Zlib c a b 1400 2800 4200 5600 7000 6511.98 6374.22 5549.21 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG Test: Socket Activity OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Socket Activity a b c 20K 40K 60K 80K 100K 94302.89 79477.34 50201.57 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG Test: Memory Copying OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.06 Test: Memory Copying a b c 2K 4K 6K 8K 10K 10941.37 10922.24 10908.20 1. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
John The Ripper 2023.03.14, Test: WPA PSK (Real C/S, More Is Better): a = 483328, b = 482323, c = 255829. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2
Stress-NG 0.15.06, Test: IO_uring (Bogo Ops/s, More Is Better): c = 26405.01, b = 26364.63, a = 26355.62. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
John The Ripper 2023.03.14, Test: bcrypt (Real C/S, More Is Better): b = 114825, a = 114777, c = 64015. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2
John The Ripper 2023.03.14, Test: Blowfish (Real C/S, More Is Better): a = 114499, b = 107388, c = 51517. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2
Stress-NG 0.15.06, Test: CPU Cache (Bogo Ops/s, More Is Better): b = 1691720.80, c = 1657780.82, a = 1441584.05. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG 0.15.06, Test: Malloc (Bogo Ops/s, More Is Better): b = 191206280.20, c = 190376822.58, a = 190168476.52. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG 0.15.06, Test: System V Message Passing (Bogo Ops/s, More Is Better): c = 73426960.15, a = 10295375.81, b = 10283641.13. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG 0.15.06, Test: MEMFD (Bogo Ops/s, More Is Better): a = 766.90, b = 583.83, c = 583.20. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG 0.15.06, Test: NUMA (Bogo Ops/s, More Is Better): a = 437.26, c = 433.52, b = 432.40. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG 0.15.06, Test: Poll (Bogo Ops/s, More Is Better): b = 9566416.75, a = 9551392.98, c = 9500903.35. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG 0.15.06, Test: Glibc C String Functions (Bogo Ops/s, More Is Better): a = 60467587.40, b = 59352709.91, c = 58278938.78. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG 0.15.06, Test: Semaphores (Bogo Ops/s, More Is Better): b = 12234144.91, c = 12232435.50, a = 11840389.35. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG 0.15.06, Test: Vector Math (Bogo Ops/s, More Is Better): a = 306730.85, b = 306299.59, c = 306225.09. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG 0.15.06, Test: Matrix Math (Bogo Ops/s, More Is Better): c = 341249.75, b = 341137.35, a = 335705.12. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG 0.15.06, Test: CPU Stress (Bogo Ops/s, More Is Better): c = 157803.19, a = 152655.29, b = 152455.93. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG 0.15.06, Test: Hash (Bogo Ops/s, More Is Better): b = 13204487.51, c = 13066446.38, a = 12949136.28. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG 0.15.06, Test: SENDFILE (Bogo Ops/s, More Is Better): c = 1165355.20, b = 1162795.27, a = 1143575.46. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG 0.15.06, Test: Function Call (Bogo Ops/s, More Is Better): c = 431183.83, a = 429335.83, b = 413303.53. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG 0.15.06, Test: Mutex (Bogo Ops/s, More Is Better): b = 36505304.60, a = 36280764.76, c = 36276764.04. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG 0.15.06, Test: Forking (Bogo Ops/s, More Is Better): b = 66467.95, a = 65824.56, c = 65696.65. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG 0.15.06, Test: Crypto (Bogo Ops/s, More Is Better): b = 108459.26, c = 107612.44, a = 105894.93. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG 0.15.06, Test: MMAP (Bogo Ops/s, More Is Better): a = 3124.36, b = 3115.23, c = 3092.95. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG 0.15.06, Test: x86_64 RdRand (Bogo Ops/s, More Is Better): a = 658591.72, c = 658542.68, b = 658461.44. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG 0.15.06, Test: Glibc Qsort Data Sorting (Bogo Ops/s, More Is Better): a = 1550.53, c = 1547.20, b = 1546.69. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
Stress-NG 0.15.06, Test: Context Switching (Bogo Ops/s, More Is Better): c = 3511599.94, a = 3511158.35, b = 3508321.29. (CC) gcc options: -std=gnu99 -O2 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
TensorFlow 2.12, Device: CPU - Batch Size: 16 - Model: GoogLeNet (images/sec, More Is Better): a = 73.81, b = 72.81, c = 72.70.
GROMACS 2023, Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better): a = 9.268, c = 9.259, b = 9.199. (CXX) g++ options: -O3
AOM AV1 3.6, Encoder Mode: Speed 0 Two-Pass - Input: Bosphorus 1080p (Frames Per Second, More Is Better): c = 0.73, b = 0.73, a = 0.73. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
Darmstadt Automotive Parallel Heterogeneous Suite 2021.11.02, Backend: OpenMP - Kernel: Euclidean Cluster (Test Cases Per Minute, More Is Better): c = 848.69, a = 842.32, b = 796.63. (CXX) g++ options: -O3 -std=c++11 -fopenmp
Darmstadt Automotive Parallel Heterogeneous Suite 2021.11.02, Backend: OpenMP - Kernel: NDT Mapping (Test Cases Per Minute, More Is Better): c = 892.13, a = 817.52, b = 771.26. (CXX) g++ options: -O3 -std=c++11 -fopenmp
Blender 3.5, Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better): a = 24.20, b = 24.36, c = 24.42.
SVT-AV1 1.4, Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, More Is Better): b = 38.18, a = 37.74, c = 37.33. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
TensorFlow 2.12, Device: CPU - Batch Size: 32 - Model: AlexNet (images/sec, More Is Better): a = 178.52, c = 175.31, b = 174.81.
oneDNN 3.1, Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): a = 3.67056 (MIN: 3.52), b = 4.16467 (MIN: 3.58); SE +/- 0.04476, N = 14. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
AOM AV1 3.6, Encoder Mode: Speed 6 Realtime - Input: Bosphorus 1080p (Frames Per Second, More Is Better): a = 28.78, c = 28.61, b = 27.93. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
SPECFEM3D 4.0, Model: Homogeneous Halfspace (Seconds, Fewer Is Better): a = 18.03, b = 18.18, c = 18.29. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
AOM AV1 3.6, Encoder Mode: Speed 8 Realtime - Input: Bosphorus 1080p (Frames Per Second, More Is Better): b = 29.81, c = 29.56, a = 29.39. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
AOM AV1 3.6, Encoder Mode: Speed 10 Realtime - Input: Bosphorus 1080p (Frames Per Second, More Is Better): c = 30.98, b = 30.21, a = 30.04. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
Embree 4.0.1, Binary: Pathtracer - Model: Asian Dragon Obj (Frames Per Second, More Is Better): b = 76.66 (MIN: 74.78 / MAX: 80.24), c = 76.41 (MIN: 74.48 / MAX: 78.97), a = 76.17 (MIN: 73.69 / MAX: 79.67).
AOM AV1 3.6, Encoder Mode: Speed 9 Realtime - Input: Bosphorus 1080p (Frames Per Second, More Is Better): a = 31.70, b = 31.10, c = 30.01. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
Embree 4.0.1, Binary: Pathtracer ISPC - Model: Asian Dragon Obj (Frames Per Second, More Is Better): b = 89.29 (MIN: 87.48 / MAX: 93.27), a = 89.18 (MIN: 87.45 / MAX: 91.82), c = 88.89 (MIN: 87.09 / MAX: 93.32).
SPECFEM3D 4.0, Model: Tomographic Model (Seconds, Fewer Is Better): b = 14.71, c = 14.96, a = 15.30. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
oneDNN 3.1, Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): a = 0.201235 (MIN: 0.18), b = 0.218330 (MIN: 0.19); SE +/- 0.001478, N = 11. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Timed FFmpeg Compilation 6.0, Time To Compile (Seconds, Fewer Is Better): a = 17.81, c = 17.84, b = 17.87.
SPECFEM3D 4.0, Model: Mount St. Helens (Seconds, Fewer Is Better): b = 13.46, c = 13.57, a = 13.74. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
TensorFlow 2.12, Device: CPU - Batch Size: 16 - Model: AlexNet (images/sec, More Is Better): b = 145.79, c = 145.45, a = 138.07.
oneDNN 3.1, Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): b = 1.40563 (MIN: 1.16), a = 1.41955 (MIN: 1.24); SE +/- 0.00845, N = 3. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.1, Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): b = 1.16101 (MIN: 0.98), a = 1.17192 (MIN: 0.97); SE +/- 0.00262, N = 3. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.1, Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): a = 2.10120 (MIN: 2.03), b = 2.16346 (MIN: 2.03); SE +/- 0.02056, N = 3. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.1, Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): a = 0.875338 (MIN: 0.82), b = 0.925478 (MIN: 0.82); SE +/- 0.008834, N = 6. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Embree 4.0.1, Binary: Pathtracer - Model: Crown (Frames Per Second, More Is Better): c = 71.89 (MIN: 67.5 / MAX: 80.61), b = 71.83 (MIN: 68.18 / MAX: 79.7), a = 71.27 (MIN: 67.7 / MAX: 79.85).
SVT-AV1 1.4, Encoder Mode: Preset 12 - Input: Bosphorus 4K (Frames Per Second, More Is Better): c = 120.06, b = 119.69, a = 116.89. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 1.4, Encoder Mode: Preset 13 - Input: Bosphorus 4K (Frames Per Second, More Is Better): c = 97.57, b = 97.19, a = 97.16. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Embree 4.0.1, Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, More Is Better): c = 84.59 (MIN: 82.73 / MAX: 90.07), a = 84.35 (MIN: 81.19 / MAX: 88.18), b = 84.28 (MIN: 81.06 / MAX: 88.82).
Google Draco 1.5.6, Model: Church Facade (ms, Fewer Is Better): b = 6702, a = 6765, c = 6767. (CXX) g++ options: -O3
Embree 4.0.1, Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better): b = 88.57 (MIN: 84.55 / MAX: 93.3), a = 88.00 (MIN: 84.5 / MAX: 93.11), c = 87.85 (MIN: 84.35 / MAX: 92.29).
Google Draco 1.5.6, Model: Lion (ms, Fewer Is Better): b = 5479, a = 5502, c = 5521. (CXX) g++ options: -O3
Embree 4.0.1, Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better): b = 104.48 (MIN: 103 / MAX: 108.93), c = 104.28 (MIN: 102.35 / MAX: 108.15), a = 103.14 (MIN: 101.66 / MAX: 108.37).
Phoronix Test Suite v10.8.5