dddd

AMD Ryzen 7 7840HS testing with a Framework Laptop 16 (AMD Ryzen 7040 ) FRANMZCP07 (03.01 BIOS) and AMD Radeon 780M 512MB on Ubuntu 24.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2411045-NE-DDDD1454570&grs&sro .
dddd: system details (runs a, aa, b, d)

Processor: AMD Ryzen 7 7840HS @ 5.29GHz (8 Cores / 16 Threads)
Motherboard: Framework Laptop 16 (AMD Ryzen 7040 ) FRANMZCP07 (03.01 BIOS)
Chipset: AMD Device 14e8
Memory: 2 x 8GB DDR5-5600MT/s A-DATA AD5S56008G-B
Disk: 512GB Western Digital PC SN810 SDCPNRY-512G
Graphics: AMD Radeon 780M 512MB
Audio: AMD Navi 31 HDMI/DP
Network: MEDIATEK MT7922 802.11ax PCI
OS: Ubuntu 24.04
Kernel: 6.8.0-39-generic (x86_64) and 6.8.0-40-generic (x86_64) (the kernel differs between runs)
Desktop: GNOME Shell 46.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 24.2~git2406200600.0ac0fb~oibaf~n (git-0ac0fbc 2024-06-20 noble-oibaf-ppa) (LLVM 17.0.6 DRM 3.57)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 2560x1600

Kernel Details: Transparent Huge Pages: madvise

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details:
  Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance)
  Platform Profile: balanced
  CPU Microcode: 0xa704103
  ACPI Profile: balanced

Graphics Details: BAR1 / Visible vRAM Size: 512 MB

Java Details: OpenJDK Runtime Environment (build 21.0.4+7-Ubuntu-1ubuntu224.04)

Python Details: Python 3.12.3

Security Details:
  gather_data_sampling: Not affected
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  mmio_stale_data: Not affected
  reg_file_data_sampling: Not affected
  retbleed: Not affected
  spec_rstack_overflow: Vulnerable: Safe RET no microcode
  spec_store_bypass: Mitigation of SSB disabled via prctl
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected
  srbds: Not affected
  tsx_async_abort: Not affected
dddd: results overview by run. A dash marks a run with no value for that test.

Test | a | aa | b | d
xnnpack: FP32MobileNetV1 | - | 1536 | 13914 | 1617
xnnpack: FP16MobileNetV2 | - | 1337 | 10687 | 1533
xnnpack: FP16MobileNetV3Large | - | 1429 | 11295 | 1602
xnnpack: FP32MobileNetV3Large | - | 1255 | 9205 | 1319
xnnpack: FP32MobileNetV2 | - | 1057 | 6981 | 1130
xnnpack: FP32MobileNetV3Small | - | 465 | 1998 | 515
xnnpack: FP16MobileNetV1 | - | 1868 | 6239 | 2169
litert: Quantized COCO SSD MobileNet v1 | - | 2016.71 | 22695.2 | 2338.08
onnx: fcn-resnet101-11 - CPU - Standard | - | 1.59037 | 1.34498 | 0.79605
onnx: ResNet101_DUC_HDC-12 - CPU - Standard | - | 0.729925 | 0.381099 | 0.611475
onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard | - | 52.3988 | 47.839 | 32.9074
onnx: yolov4 - CPU - Standard | - | 9.36551 | 5.90138 | 8.7288
onnx: ArcFace ResNet-100 - CPU - Standard | - | 20.5943 | 32.4009 | 30.2995
onnx: ZFNet-512 - CPU - Standard | - | 86.771 | 60.4683 | 57.4247
onnx: ResNet101_DUC_HDC-12 - CPU - Parallel | - | 0.466728 | 0.404181 | 0.356134
encode-opus: WAV To Opus Encode | - | 22.172 | 26.284 | 28.909
onnx: bertsquad-12 - CPU - Standard | - | 15.779 | 15.6956 | 12.6292
byte: Whetstone Double | - | 155612.4 | 137723.9 | 125618.1
svt-av1: Preset 13 - Beauty 4K 10-bit | - | 7.579 | 6.83 | 6.224
onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel | - | 40.1989 | 35.462 | 33.2835
astcenc: Fast | - | 170.7735 | 156.3041 | 142.7008
astcenc: Medium | - | 67.6726 | 62.1679 | 56.7288
svt-av1: Preset 8 - Beauty 4K 10-bit | - | 5.343 | 4.909 | 4.484
astcenc: Thorough | - | 8.8155 | 8.1192 | 7.4152
astcenc: Very Thorough | - | 1.2075 | 1.1154 | 1.017
astcenc: Exhaustive | - | 0.7361 | 0.6837 | 0.6236
onednn: Deconvolution Batch shapes_3d - CPU | - | 4.86939 | 4.92932 | 5.72961
svt-av1: Preset 5 - Beauty 4K 10-bit | - | 3.119 | 2.918 | 2.659
litert: Mobilenet Quant | - | 1249.46 | 1276.41 | 1458.15
onnx: super-resolution-10 - CPU - Parallel | - | 71.3414 | 68.1345 | 61.2545
onnx: ResNet50 v1-12-int8 - CPU - Standard | - | 243.293 | 225.296 | 209.121
onnx: fcn-resnet101-11 - CPU - Parallel | - | 0.88519 | 0.763448 | 0.809516
onnx: yolov4 - CPU - Parallel | - | 6.61722 | 6.63616 | 5.74356
onnx: ArcFace ResNet-100 - CPU - Parallel | - | 20.9556 | 20.257 | 18.1385
cassandra: Writes | - | 115862 | 108743 | 100972
svt-av1: Preset 3 - Beauty 4K 10-bit | - | 0.705 | 0.674 | 0.619
litert: NASNet Mobile | - | 7642.71 | 7816 | 8630.17
xnnpack: FP16MobileNetV3Small | - | 581 | 606 | 654
onnx: bertsquad-12 - CPU - Parallel | - | 9.11603 | 8.48061 | 8.13821
litert: DeepLab V3 | - | 2125.78 | 2128.84 | 2378.54
svt-av1: Preset 3 - Bosphorus 1080p | - | 14.285 | 13.796 | 12.769
onnx: super-resolution-10 - CPU - Standard | - | 113.981 | 111.026 | 101.972
svt-av1: Preset 3 - Bosphorus 4K | - | 4.169 | 4.022 | 3.748
onnx: ResNet50 v1-12-int8 - CPU - Parallel | - | 231.737 | 227.831 | 208.591
onednn: IP Shapes 1D - CPU | - | 2.51714 | 2.54144 | 2.79418
svt-av1: Preset 5 - Bosphorus 1080p | - | 49.158 | 47.191 | 44.332
xnnpack: QS8MobileNetV2 | - | 533 | 526 | 581
onnx: ZFNet-512 - CPU - Parallel | - | 62.6579 | 62.3166 | 56.7363
byte: Pipe | - | 16671392.5 | 18027399 | 16434910.2
byte: System Call | - | 16049531.4 | 17367015.3 | 15836969.9
litert: SqueezeNet | - | 2288.03 | 2286.41 | 2504.03
primesieve: 1e13 | - | 188.66 | 191.957 | 206.119
primesieve: 1e12 | - | 15.383 | 15.34 | 16.749
byte: Dhrystone 2 | - | 719195925.3 | 774189636.1 | 709502408.8
svt-av1: Preset 8 - Bosphorus 1080p | - | 163.326 | 160.583 | 149.784
onednn: Recurrent Neural Network Training - CPU | - | 3155.62 | 3104.92 | 3379.99
litert: Mobilenet Float | - | 1527.48 | 1491.33 | 1622.81
warpx: Plasma Acceleration | - | 55.36615002 | 56.63241771 | 60.22498194
litert: Inception ResNet V2 | - | 28571.2 | 28254.4 | 30683.2
whisperfile: Tiny | - | 55.18821 | 58.1555 | 59.90084
litert: Inception V4 | - | 33358 | 33308.8 | 36136.2
svt-av1: Preset 13 - Bosphorus 1080p | - | 497.584 | 489.204 | 460.217
onednn: Deconvolution Batch shapes_1d - CPU | - | 6.10289 | 6.13915 | 6.58308
svt-av1: Preset 5 - Bosphorus 4K | - | 15.057 | 14.522 | 13.964
svt-av1: Preset 8 - Bosphorus 4K | - | 49.912 | 49.519 | 46.575
onednn: Recurrent Neural Network Inference - CPU | - | 1770.5 | 1727.88 | 1844.16
whisperfile: Small | - | 229.32516 | 231.05572 | 243.97484
svt-av1: Preset 13 - Bosphorus 4K | - | 121.142 | 119.373 | 114.049
stockfish: Chess Benchmark | - | 21157157 | 20057178 | 20267956
whisperfile: Medium | - | 655.14231 | 657.63606 | 690.37381
cp2k: Fayalite-FIST | - | 147.309 | 154.827 | 151.938
onnx: CaffeNet 12-int8 - CPU - Parallel | - | 456.271 | 456.335 | 434.39
cp2k: H20-256 | - | 1002.985 | 1010.618 | 1052.128
onnx: CaffeNet 12-int8 - CPU - Standard | - | 498.19 | 502.552 | 479.385
namd: ATPase with 327,506 Atoms | - | 1.32956 | 1.27220 | 1.28715
onnx: T5 Encoder - CPU - Standard | - | 142.366 | 148.556 | 147.262
cp2k: H20-64 | - | 99.801 | 96.287 | 100.098
onnx: GPT-2 - CPU - Parallel | - | 106.355 | 104.496 | 102.473
namd: STMV with 1,066,628 Atoms | - | 0.38378 | 0.37007 | 0.37644
unvanquished: 1920 x 1080 - High | 460.1 | 473.3 | 471.6 | 472.3
onnx: GPT-2 - CPU - Standard | - | 122.646 | 124.718 | 121.495
unvanquished: 1920 x 1200 - Medium | 511.8 | 512.5 | 517.5 | 504.8
warpx: Uniform Plasma | - | 21.17645074 | 21.69885843 | 21.70649519
onnx: T5 Encoder - CPU - Parallel | - | 140.288 | 137.129 | 137.211
onednn: IP Shapes 3D - CPU | - | 4.57722 | 4.662 | 4.66831
epoch: Cone | - | 546.93 | 554.79 | 549.16
unvanquished: 1920 x 1080 - Ultra | 418.2 | 417.3 | 416.3 | 422.2
unvanquished: 1920 x 1200 - Ultra | 385.3 | 384.1 | 384.2 | 380.8
unvanquished: 2560 x 1440 - Ultra | 253.4 | 254.7 | 253.9 | 252.4
unvanquished: 1920 x 1200 - High | 437.5 | 439.3 | 438 | 436.4
unvanquished: 2560 x 1440 - Medium | 371.2 | 373.2 | 373.5 | 373.4
unvanquished: 1920 x 1080 - Medium | 542.5 | 539.8 | 542.4 | 540.7
unvanquished: 2560 x 1600 - Ultra | 233.9 | 234.6 | 234 | 235
unvanquished: 2560 x 1440 - High | 288.6 | 289.6 | 289.8 | 288.7
unvanquished: 2560 x 1600 - Medium | 336 | 336.2 | 336.6 | 337
onednn: Convolution Batch Shapes Auto - CPU | - | 13.3615 | 13.3262 | 13.3305
unvanquished: 2560 x 1600 - High | 266.5 | 266.7 | 266.4 | 266.9
paraview: Many Spheres - 600 - 2560 x 1440 | 0.17 | - | - | -
paraview: Many Spheres - 600 - 1920 x 1200 | 0.18 | - | - | -
paraview: Many Spheres - 600 - 1920 x 1080 | 0.18 | - | - | -
onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard | - | 19.0821 | 20.9008 | 30.3849
onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel | - | 24.8744 | 28.1973 | 30.043
onnx: ResNet101_DUC_HDC-12 - CPU - Standard | - | 1370 | 2623.99 | 1635.39
onnx: ResNet101_DUC_HDC-12 - CPU - Parallel | - | 2142.57 | 2474.13 | 2807.93
onnx: super-resolution-10 - CPU - Standard | - | 8.7719 | 9.0054 | 9.80512
onnx: super-resolution-10 - CPU - Parallel | - | 14.0165 | 14.6761 | 16.3246
onnx: ResNet50 v1-12-int8 - CPU - Standard | - | 4.10903 | 4.4372 | 4.78049
onnx: ResNet50 v1-12-int8 - CPU - Parallel | - | 4.31466 | 4.38853 | 4.79323
onnx: ArcFace ResNet-100 - CPU - Standard | - | 48.5551 | 30.8617 | 33.002
onnx: ArcFace ResNet-100 - CPU - Parallel | - | 47.7186 | 49.3639 | 55.1297
onnx: fcn-resnet101-11 - CPU - Standard | - | 628.78 | 743.501 | 1256.2
onnx: fcn-resnet101-11 - CPU - Parallel | - | 1129.7 | 1309.84 | 1235.3
onnx: CaffeNet 12-int8 - CPU - Standard | - | 2.00621 | 1.9887 | 2.08476
onnx: CaffeNet 12-int8 - CPU - Parallel | - | 2.19001 | 2.18961 | 2.30027
onnx: bertsquad-12 - CPU - Standard | - | 63.3729 | 63.7087 | 79.1783
onnx: bertsquad-12 - CPU - Parallel | - | 109.695 | 117.914 | 122.875
onnx: T5 Encoder - CPU - Standard | - | 7.02169 | 6.72951 | 6.78853
onnx: T5 Encoder - CPU - Parallel | - | 7.12656 | 7.29084 | 7.28642
onnx: ZFNet-512 - CPU - Standard | - | 11.5232 | 16.5356 | 17.4118
onnx: ZFNet-512 - CPU - Parallel | - | 15.958 | 16.0452 | 17.6235
onnx: yolov4 - CPU - Standard | - | 106.772 | 169.449 | 114.561
onnx: yolov4 - CPU - Parallel | - | 151.119 | 150.687 | 174.106
onnx: GPT-2 - CPU - Standard | - | 8.14774 | 8.01188 | 8.22476
onnx: GPT-2 - CPU - Parallel | - | 9.39903 | 9.56576 | 9.75407
paraview: Many Spheres - 600 - 2560 x 1440 | 17.38 | - | - | -
paraview: Many Spheres - 600 - 1920 x 1200 | 17.798 | - | - | -
paraview: Many Spheres - 600 - 1920 x 1080 | 17.859 | - | - | -
XNNPACK b7b048, Model: FP32MobileNetV1 (us, Fewer Is Better). aa: 1536; b: 13914; d: 1617. (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048, Model: FP16MobileNetV2 (us, Fewer Is Better). aa: 1337; b: 10687; d: 1533. (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048, Model: FP16MobileNetV3Large (us, Fewer Is Better). aa: 1429; b: 11295; d: 1602. (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048, Model: FP32MobileNetV3Large (us, Fewer Is Better). aa: 1255; b: 9205; d: 1319. (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048, Model: FP32MobileNetV2 (us, Fewer Is Better). aa: 1057; b: 6981; d: 1130. (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048, Model: FP32MobileNetV3Small (us, Fewer Is Better). aa: 465; b: 1998; d: 515. (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048, Model: FP16MobileNetV1 (us, Fewer Is Better). aa: 1868; b: 6239; d: 2169. (CXX) g++ options: -O3 -lrt -lm
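Because the XNNPACK results above are "Fewer Is Better", the gap between runs is easiest to read as a ratio of times. The following is a minimal sketch, assuming the microsecond values are copied by hand from the seven XNNPACK results above and that run aa is used as the baseline (a choice made for this illustration, not something the result file prescribes). The names `xnnpack_us` and `slowdown` are hypothetical.

```python
# XNNPACK latencies in microseconds (fewer is better), copied from the results above.
xnnpack_us = {
    "FP32MobileNetV1":      {"aa": 1536, "b": 13914, "d": 1617},
    "FP16MobileNetV2":      {"aa": 1337, "b": 10687, "d": 1533},
    "FP16MobileNetV3Large": {"aa": 1429, "b": 11295, "d": 1602},
    "FP32MobileNetV3Large": {"aa": 1255, "b": 9205,  "d": 1319},
    "FP32MobileNetV2":      {"aa": 1057, "b": 6981,  "d": 1130},
    "FP32MobileNetV3Small": {"aa": 465,  "b": 1998,  "d": 515},
    "FP16MobileNetV1":      {"aa": 1868, "b": 6239,  "d": 2169},
}


def slowdown(result, run, baseline="aa"):
    """Return how many times slower `run` is than `baseline` (1.0 means equal)."""
    return result[run] / result[baseline]


# Print the slowdown of runs b and d relative to aa for every model.
for model, result in xnnpack_us.items():
    print(f"{model:22s} b: {slowdown(result, 'b'):5.2f}x  d: {slowdown(result, 'd'):5.2f}x")
```

A ratio is used rather than a difference so that models with very different absolute latencies can be compared on the same scale.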
LiteRT 2024-10-15, Model: Quantized COCO SSD MobileNet v1 (Microseconds, Fewer Is Better). aa: 2016.71; b: 22695.20; d: 2338.08.
ONNX Runtime 1.19, Model: fcn-resnet101-11 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better). aa: 1.59037; b: 1.34498; d: 0.79605. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better). aa: 0.729925; b: 0.381099; d: 0.611475. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better). aa: 52.40; b: 47.84; d: 32.91. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: yolov4 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better). aa: 9.36551; b: 5.90138; d: 8.72880. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better). aa: 20.59; b: 32.40; d: 30.30. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: ZFNet-512 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better). aa: 86.77; b: 60.47; d: 57.42. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel (Inferences Per Second, More Is Better). aa: 0.466728; b: 0.404181; d: 0.356134. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
Opus Codec Encoding 1.5.2, WAV To Opus Encode (Seconds, Fewer Is Better). aa: 22.17; b: 26.28; d: 28.91. (CXX) g++ options: -O3 -fvisibility=hidden -logg -lm
ONNX Runtime 1.19, Model: bertsquad-12 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better). aa: 15.78; b: 15.70; d: 12.63. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
BYTE Unix Benchmark 5.1.3-git, Computational Test: Whetstone Double (MWIPS, More Is Better). aa: 155612.4; b: 137723.9; d: 125618.1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
SVT-AV1 2.3, Encoder Mode: Preset 13 - Input: Beauty 4K 10-bit (Frames Per Second, More Is Better). aa: 7.579; b: 6.830; d: 6.224. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
ONNX Runtime 1.19, Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel (Inferences Per Second, More Is Better). aa: 40.20; b: 35.46; d: 33.28. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ASTC Encoder 5.0, Preset: Fast (MT/s, More Is Better). aa: 170.77; b: 156.30; d: 142.70. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder 5.0, Preset: Medium (MT/s, More Is Better). aa: 67.67; b: 62.17; d: 56.73. (CXX) g++ options: -O3 -flto -pthread
SVT-AV1 2.3, Encoder Mode: Preset 8 - Input: Beauty 4K 10-bit (Frames Per Second, More Is Better). aa: 5.343; b: 4.909; d: 4.484. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
ASTC Encoder 5.0, Preset: Thorough (MT/s, More Is Better). aa: 8.8155; b: 8.1192; d: 7.4152. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder 5.0, Preset: Very Thorough (MT/s, More Is Better). aa: 1.2075; b: 1.1154; d: 1.0170. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder 5.0, Preset: Exhaustive (MT/s, More Is Better). aa: 0.7361; b: 0.6837; d: 0.6236. (CXX) g++ options: -O3 -flto -pthread
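For "More Is Better" results such as the ASTC Encoder throughput figures above, a signed percent change against a baseline run makes the trend across presets easier to compare. This is a small sketch under the assumption that run aa is the baseline and that the MT/s values are transcribed from the five ASTC Encoder results above. The names `astcenc_mts` and `percent_change` are hypothetical.

```python
# ASTC Encoder 5.0 throughput in MT/s (more is better), copied from the results above.
astcenc_mts = {
    "Fast":          {"aa": 170.77, "b": 156.30, "d": 142.70},
    "Medium":        {"aa": 67.67,  "b": 62.17,  "d": 56.73},
    "Thorough":      {"aa": 8.8155, "b": 8.1192, "d": 7.4152},
    "Very Thorough": {"aa": 1.2075, "b": 1.1154, "d": 1.0170},
    "Exhaustive":    {"aa": 0.7361, "b": 0.6837, "d": 0.6236},
}


def percent_change(result, run, baseline="aa"):
    """Signed percent change of `run` relative to `baseline` (negative means lower throughput)."""
    return 100.0 * (result[run] - result[baseline]) / result[baseline]


# Print the change of runs b and d relative to aa for every preset.
for preset, result in astcenc_mts.items():
    print(f"{preset:14s} b: {percent_change(result, 'b'):+6.2f}%  d: {percent_change(result, 'd'):+6.2f}%")
```

For "Fewer Is Better" results the sign would need to be read the other way round, since a positive change there means a slower run.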
oneDNN 3.6, Harness: Deconvolution Batch shapes_3d - Engine: CPU (ms, Fewer Is Better). aa: 4.86939 (MIN: 4.64); b: 4.92932 (MIN: 4.92); d: 5.72961 (MIN: 5.66). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
SVT-AV1 2.3, Encoder Mode: Preset 5 - Input: Beauty 4K 10-bit (Frames Per Second, More Is Better). aa: 3.119; b: 2.918; d: 2.659. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
LiteRT 2024-10-15, Model: Mobilenet Quant (Microseconds, Fewer Is Better). aa: 1249.46; b: 1276.41; d: 1458.15.
ONNX Runtime 1.19, Model: super-resolution-10 - Device: CPU - Executor: Parallel (Inferences Per Second, More Is Better). aa: 71.34; b: 68.13; d: 61.25. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better). aa: 243.29; b: 225.30; d: 209.12. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel (Inferences Per Second, More Is Better). aa: 0.885190; b: 0.763448; d: 0.809516. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: yolov4 - Device: CPU - Executor: Parallel (Inferences Per Second, More Is Better). aa: 6.61722; b: 6.63616; d: 5.74356. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel (Inferences Per Second, More Is Better). aa: 20.96; b: 20.26; d: 18.14. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
Apache Cassandra 5.0, Test: Writes (Op/s, More Is Better). aa: 115862; b: 108743; d: 100972.
SVT-AV1 2.3, Encoder Mode: Preset 3 - Input: Beauty 4K 10-bit (Frames Per Second, More Is Better). aa: 0.705; b: 0.674; d: 0.619. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
LiteRT 2024-10-15, Model: NASNet Mobile (Microseconds, Fewer Is Better). aa: 7642.71; b: 7816.00; d: 8630.17.
XNNPACK b7b048, Model: FP16MobileNetV3Small (us, Fewer Is Better). aa: 581; b: 606; d: 654. (CXX) g++ options: -O3 -lrt -lm
ONNX Runtime 1.19, Model: bertsquad-12 - Device: CPU - Executor: Parallel (Inferences Per Second, More Is Better). aa: 9.11603; b: 8.48061; d: 8.13821. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
LiteRT 2024-10-15, Model: DeepLab V3 (Microseconds, Fewer Is Better). aa: 2125.78; b: 2128.84; d: 2378.54.
SVT-AV1 2.3, Encoder Mode: Preset 3 - Input: Bosphorus 1080p (Frames Per Second, More Is Better). aa: 14.29; b: 13.80; d: 12.77. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
ONNX Runtime 1.19, Model: super-resolution-10 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better). aa: 113.98; b: 111.03; d: 101.97. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
SVT-AV1 2.3, Encoder Mode: Preset 3 - Input: Bosphorus 4K (Frames Per Second, More Is Better). aa: 4.169; b: 4.022; d: 3.748. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
ONNX Runtime 1.19, Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel (Inferences Per Second, More Is Better). aa: 231.74; b: 227.83; d: 208.59. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
oneDNN 3.6, Harness: IP Shapes 1D - Engine: CPU (ms, Fewer Is Better). aa: 2.51714 (MIN: 2.4); b: 2.54144 (MIN: 2.43); d: 2.79418 (MIN: 2.66). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
SVT-AV1 2.3, Encoder Mode: Preset 5 - Input: Bosphorus 1080p (Frames Per Second, More Is Better). aa: 49.16; b: 47.19; d: 44.33. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
XNNPACK b7b048, Model: QS8MobileNetV2 (us, Fewer Is Better). aa: 533; b: 526; d: 581. (CXX) g++ options: -O3 -lrt -lm
ONNX Runtime 1.19, Model: ZFNet-512 - Device: CPU - Executor: Parallel (Inferences Per Second, More Is Better). aa: 62.66; b: 62.32; d: 56.74. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
BYTE Unix Benchmark 5.1.3-git, Computational Test: Pipe (LPS, More Is Better). aa: 16671392.5; b: 18027399.0; d: 16434910.2. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
BYTE Unix Benchmark 5.1.3-git, Computational Test: System Call (LPS, More Is Better). aa: 16049531.4; b: 17367015.3; d: 15836969.9. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
LiteRT 2024-10-15, Model: SqueezeNet (Microseconds, Fewer Is Better). aa: 2288.03; b: 2286.41; d: 2504.03.
Primesieve 12.5, Length: 1e13 (Seconds, Fewer Is Better). aa: 188.66; b: 191.96; d: 206.12. (CXX) g++ options: -O3
Primesieve 12.5, Length: 1e12 (Seconds, Fewer Is Better). aa: 15.38; b: 15.34; d: 16.75. (CXX) g++ options: -O3
BYTE Unix Benchmark 5.1.3-git, Computational Test: Dhrystone 2 (LPS, More Is Better). aa: 719195925.3; b: 774189636.1; d: 709502408.8. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
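The four BYTE Unix Benchmark results in this file (Whetstone Double, Pipe, System Call, Dhrystone 2) use different units and magnitudes, so averaging the raw numbers would be meaningless. A geometric mean of per-test ratios is one common way to fold such results into a single figure. The sketch below assumes run aa as the baseline and uses values transcribed from those four results. The names `byte_results` and `geomean_vs_baseline` are hypothetical.

```python
from statistics import geometric_mean

# BYTE Unix Benchmark 5.1.3-git results (more is better), copied from this result file.
byte_results = {
    "Whetstone Double": {"aa": 155612.4,    "b": 137723.9,    "d": 125618.1},
    "Pipe":             {"aa": 16671392.5,  "b": 18027399.0,  "d": 16434910.2},
    "System Call":      {"aa": 16049531.4,  "b": 17367015.3,  "d": 15836969.9},
    "Dhrystone 2":      {"aa": 719195925.3, "b": 774189636.1, "d": 709502408.8},
}


def geomean_vs_baseline(results, run, baseline="aa"):
    """Geometric mean of run/baseline ratios; above 1.0 means `run` is faster overall."""
    ratios = [r[run] / r[baseline] for r in results.values()]
    return geometric_mean(ratios)


# Summarize runs b and d against aa in a single unitless figure each.
for run in ("b", "d"):
    print(f"{run} vs aa: {geomean_vs_baseline(byte_results, run):.4f}")
```

The geometric mean is chosen here because it treats a 10% gain on one test and a 10% loss on another symmetrically, regardless of each test's absolute scale.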
SVT-AV1 2.3, Encoder Mode: Preset 8 - Input: Bosphorus 1080p (Frames Per Second, More Is Better). aa: 163.33; b: 160.58; d: 149.78. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
oneDNN 3.6, Harness: Recurrent Neural Network Training - Engine: CPU (ms, Fewer Is Better). aa: 3155.62 (MIN: 3145.17); b: 3104.92 (MIN: 3084.26); d: 3379.99 (MIN: 3369.93). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
LiteRT 2024-10-15, Model: Mobilenet Float (Microseconds, Fewer Is Better). aa: 1527.48; b: 1491.33; d: 1622.81.
WarpX 24.10, Input: Plasma Acceleration (Seconds, Fewer Is Better). aa: 55.37; b: 56.63; d: 60.22. (CXX) g++ options: -O3 -lm
LiteRT 2024-10-15, Model: Inception ResNet V2 (Microseconds, Fewer Is Better). aa: 28571.2; b: 28254.4; d: 30683.2.
Whisperfile 20Aug24, Model Size: Tiny (Seconds, Fewer Is Better). aa: 55.19; b: 58.16; d: 59.90.
LiteRT 2024-10-15, Model: Inception V4 (Microseconds, Fewer Is Better). aa: 33358.0; b: 33308.8; d: 36136.2.
SVT-AV1 2.3, Encoder Mode: Preset 13 - Input: Bosphorus 1080p (Frames Per Second, More Is Better). aa: 497.58; b: 489.20; d: 460.22. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
oneDNN 3.6, Harness: Deconvolution Batch shapes_1d - Engine: CPU (ms, Fewer Is Better). aa: 6.10289 (MIN: 4.53); b: 6.13915 (MIN: 4.76); d: 6.58308 (MIN: 5.22). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
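The oneDNN results carry a MIN figure next to each headline value, which gives a rough sense of how noisy a run was. The sketch below assumes the headline value is the average reported for the run and that MIN is the lowest time the test reported; neither is stated in this export. It uses the Deconvolution Batch shapes_1d numbers above, and the names `deconv_1d_ms` and `spread_percent` are hypothetical.

```python
# oneDNN 3.6, Deconvolution Batch shapes_1d, times in ms (fewer is better).
# "avg" is the headline value and "min" the MIN figure, both copied from the result above.
deconv_1d_ms = {
    "aa": {"avg": 6.10289, "min": 4.53},
    "b":  {"avg": 6.13915, "min": 4.76},
    "d":  {"avg": 6.58308, "min": 5.22},
}


def spread_percent(entry):
    """How far the headline value sits above the reported minimum, in percent of the minimum."""
    return 100.0 * (entry["avg"] - entry["min"]) / entry["min"]


# A large spread suggests the gap between runs may be partly noise.
for run, entry in deconv_1d_ms.items():
    print(f"{run}: headline is {spread_percent(entry):.1f}% above MIN")
```

A spread of this size is larger than the gap between the headline values of aa and b, which is a reason to read small differences on this test with caution.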
SVT-AV1 2.3, Encoder Mode: Preset 5 - Input: Bosphorus 4K (Frames Per Second, More Is Better). aa: 15.06; b: 14.52; d: 13.96. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 2.3, Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, More Is Better). aa: 49.91; b: 49.52; d: 46.58. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
oneDNN 3.6, Harness: Recurrent Neural Network Inference - Engine: CPU (ms, Fewer Is Better). aa: 1770.50 (MIN: 1763.22); b: 1727.88 (MIN: 1720.93); d: 1844.16 (MIN: 1834.87). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
Whisperfile 20Aug24, Model Size: Small (Seconds, Fewer Is Better). aa: 229.33; b: 231.06; d: 243.97.
SVT-AV1 2.3, Encoder Mode: Preset 13 - Input: Bosphorus 4K (Frames Per Second, More Is Better). aa: 121.14; b: 119.37; d: 114.05. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Stockfish 17, Chess Benchmark (Nodes Per Second, More Is Better). aa: 21157157; b: 20057178; d: 20267956. (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -funroll-loops -msse -msse3 -mpopcnt -mavx2 -mbmi -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto-partition=one -flto=jobserver
Whisperfile 20Aug24, Model Size: Medium (Seconds, Fewer Is Better). aa: 655.14; b: 657.64; d: 690.37.
CP2K Molecular Dynamics Input: Fayalite-FIST OpenBenchmarking.org Seconds, Fewer Is Better CP2K Molecular Dynamics 2024.3 Input: Fayalite-FIST aa b d 30 60 90 120 150 147.31 154.83 151.94 1. (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
ONNX Runtime Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel aa b d 100 200 300 400 500 456.27 456.34 434.39 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
CP2K Molecular Dynamics Input: H20-256 OpenBenchmarking.org Seconds, Fewer Is Better CP2K Molecular Dynamics 2024.3 Input: H20-256 aa b d 200 400 600 800 1000 1002.99 1010.62 1052.13 1. (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
ONNX Runtime 1.19, Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better): aa: 498.19, b: 502.55, d: 479.39. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
NAMD 3.0, Input: ATPase with 327,506 Atoms (ns/day, More Is Better): aa: 1.32956, b: 1.27220, d: 1.28715.
ONNX Runtime 1.19, Model: T5 Encoder - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better): aa: 142.37, b: 148.56, d: 147.26. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
CP2K Molecular Dynamics 2024.3, Input: H20-64 (Seconds, Fewer Is Better): aa: 99.80, b: 96.29, d: 100.10. (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
ONNX Runtime 1.19, Model: GPT-2 - Device: CPU - Executor: Parallel (Inferences Per Second, More Is Better): aa: 106.36, b: 104.50, d: 102.47. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
NAMD 3.0, Input: STMV with 1,066,628 Atoms (ns/day, More Is Better): aa: 0.38378, b: 0.37007, d: 0.37644.
Unvanquished 0.55, Resolution: 1920 x 1080 - Effects Quality: High (Frames Per Second, More Is Better): a: 460.1, aa: 473.3, b: 471.6, d: 472.3.
ONNX Runtime 1.19, Model: GPT-2 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better): aa: 122.65, b: 124.72, d: 121.50. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
Unvanquished 0.55, Resolution: 1920 x 1200 - Effects Quality: Medium (Frames Per Second, More Is Better): a: 511.8, aa: 512.5, b: 517.5, d: 504.8.
WarpX 24.10, Input: Uniform Plasma (Seconds, Fewer Is Better): aa: 21.18, b: 21.70, d: 21.71. (CXX) g++ options: -O3 -lm
ONNX Runtime 1.19, Model: T5 Encoder - Device: CPU - Executor: Parallel (Inferences Per Second, More Is Better): aa: 140.29, b: 137.13, d: 137.21. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
oneDNN 3.6, Harness: IP Shapes 3D - Engine: CPU (ms, Fewer Is Better): aa: 4.57722 (MIN: 4.47), b: 4.66200 (MIN: 4.52), d: 4.66831 (MIN: 4.52). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
Epoch 4.19.4, Epoch3D Deck: Cone (Seconds, Fewer Is Better): aa: 546.93, b: 554.79, d: 549.16. (F9X) gfortran options: -O3 -std=f2003 -Jobj -lsdf -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Unvanquished 0.55, Resolution: 1920 x 1080 - Effects Quality: Ultra (Frames Per Second, More Is Better): a: 418.2, aa: 417.3, b: 416.3, d: 422.2.
Unvanquished 0.55, Resolution: 1920 x 1200 - Effects Quality: Ultra (Frames Per Second, More Is Better): a: 385.3, aa: 384.1, b: 384.2, d: 380.8.
Unvanquished 0.55, Resolution: 2560 x 1440 - Effects Quality: Ultra (Frames Per Second, More Is Better): a: 253.4, aa: 254.7, b: 253.9, d: 252.4.
Unvanquished 0.55, Resolution: 1920 x 1200 - Effects Quality: High (Frames Per Second, More Is Better): a: 437.5, aa: 439.3, b: 438.0, d: 436.4.
Unvanquished 0.55, Resolution: 2560 x 1440 - Effects Quality: Medium (Frames Per Second, More Is Better): a: 371.2, aa: 373.2, b: 373.5, d: 373.4.
Unvanquished 0.55, Resolution: 1920 x 1080 - Effects Quality: Medium (Frames Per Second, More Is Better): a: 542.5, aa: 539.8, b: 542.4, d: 540.7.
Unvanquished 0.55, Resolution: 2560 x 1600 - Effects Quality: Ultra (Frames Per Second, More Is Better): a: 233.9, aa: 234.6, b: 234.0, d: 235.0.
Unvanquished 0.55, Resolution: 2560 x 1440 - Effects Quality: High (Frames Per Second, More Is Better): a: 288.6, aa: 289.6, b: 289.8, d: 288.7.
Unvanquished 0.55, Resolution: 2560 x 1600 - Effects Quality: Medium (Frames Per Second, More Is Better): a: 336.0, aa: 336.2, b: 336.6, d: 337.0.
oneDNN 3.6, Harness: Convolution Batch Shapes Auto - Engine: CPU (ms, Fewer Is Better): aa: 13.36 (MIN: 13.14), b: 13.33 (MIN: 13.14), d: 13.33 (MIN: 13.11). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
Unvanquished 0.55, Resolution: 2560 x 1600 - Effects Quality: High (Frames Per Second, More Is Better): a: 266.5, aa: 266.7, b: 266.4, d: 266.9.
ParaView 5.13, Test: Many Spheres - Frames: 600 - Resolution: 2560 x 1440 (Frames / Sec, More Is Better): a: 0.17.
ParaView 5.13, Test: Many Spheres - Frames: 600 - Resolution: 1920 x 1200 (Frames / Sec, More Is Better): a: 0.18.
ParaView 5.13, Test: Many Spheres - Frames: 600 - Resolution: 1920 x 1080 (Frames / Sec, More Is Better): a: 0.18.
ONNX Runtime 1.19, Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard (Inference Time Cost (ms), Fewer Is Better): aa: 19.08, b: 20.90, d: 30.38. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel (Inference Time Cost (ms), Fewer Is Better): aa: 24.87, b: 28.20, d: 30.04. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard (Inference Time Cost (ms), Fewer Is Better): aa: 1370.00, b: 2623.99, d: 1635.39. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel (Inference Time Cost (ms), Fewer Is Better): aa: 2142.57, b: 2474.13, d: 2807.93. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
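Some results above differ noticeably between runs (ResNet101_DUC_HDC-12 with the Standard executor ranges from 1370.00 ms to 2623.99 ms), while others are nearly identical (Unvanquished at 2560 x 1600, High). As a rough reading aid, the following Python sketch computes the gap between the best and worst run as a percentage of the smallest value. The helper name `spread_percent` and the choice of metric are illustrative assumptions and are not part of the Phoronix Test Suite output.

```python
def spread_percent(values):
    """Return (max - min) / min * 100 for a list of positive run results."""
    lo, hi = min(values), max(values)
    if lo <= 0:
        raise ValueError("results must be positive")
    return (hi - lo) / lo * 100.0


# Values copied from the result rows above.
resnet101_standard_ms = [1370.00, 2623.99, 1635.39]               # runs aa, b, d
unvanquished_2560x1600_high_fps = [266.5, 266.7, 266.4, 266.9]    # runs a, aa, b, d

print(f"ResNet101_DUC_HDC-12 Standard: "
      f"{spread_percent(resnet101_standard_ms):.1f}% spread")
print(f"Unvanquished 2560 x 1600 High: "
      f"{spread_percent(unvanquished_2560x1600_high_fps):.2f}% spread")
```

With these inputs the ONNX Runtime row spreads by roughly 91%, whereas the Unvanquished row stays under 0.2%, which suggests treating the former with more caution when comparing runs.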
ONNX Runtime 1.19, Model: super-resolution-10 - Device: CPU - Executor: Standard (Inference Time Cost (ms), Fewer Is Better): aa: 8.77190, b: 9.00540, d: 9.80512. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: super-resolution-10 - Device: CPU - Executor: Parallel (Inference Time Cost (ms), Fewer Is Better): aa: 14.02, b: 14.68, d: 16.32. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard (Inference Time Cost (ms), Fewer Is Better): aa: 4.10903, b: 4.43720, d: 4.78049. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel (Inference Time Cost (ms), Fewer Is Better): aa: 4.31466, b: 4.38853, d: 4.79323. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard (Inference Time Cost (ms), Fewer Is Better): aa: 48.56, b: 30.86, d: 33.00. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel (Inference Time Cost (ms), Fewer Is Better): aa: 47.72, b: 49.36, d: 55.13. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: fcn-resnet101-11 - Device: CPU - Executor: Standard (Inference Time Cost (ms), Fewer Is Better): aa: 628.78, b: 743.50, d: 1256.20. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel (Inference Time Cost (ms), Fewer Is Better): aa: 1129.70, b: 1309.84, d: 1235.30. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard (Inference Time Cost (ms), Fewer Is Better): aa: 2.00621, b: 1.98870, d: 2.08476. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel (Inference Time Cost (ms), Fewer Is Better): aa: 2.19001, b: 2.18961, d: 2.30027. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: bertsquad-12 - Device: CPU - Executor: Standard (Inference Time Cost (ms), Fewer Is Better): aa: 63.37, b: 63.71, d: 79.18. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: bertsquad-12 - Device: CPU - Executor: Parallel (Inference Time Cost (ms), Fewer Is Better): aa: 109.70, b: 117.91, d: 122.88. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: T5 Encoder - Device: CPU - Executor: Standard (Inference Time Cost (ms), Fewer Is Better): aa: 7.02169, b: 6.72951, d: 6.78853. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: T5 Encoder - Device: CPU - Executor: Parallel (Inference Time Cost (ms), Fewer Is Better): aa: 7.12656, b: 7.29084, d: 7.28642. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: ZFNet-512 - Device: CPU - Executor: Standard (Inference Time Cost (ms), Fewer Is Better): aa: 11.52, b: 16.54, d: 17.41. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: ZFNet-512 - Device: CPU - Executor: Parallel (Inference Time Cost (ms), Fewer Is Better): aa: 15.96, b: 16.05, d: 17.62. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: yolov4 - Device: CPU - Executor: Standard (Inference Time Cost (ms), Fewer Is Better): aa: 106.77, b: 169.45, d: 114.56. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: yolov4 - Device: CPU - Executor: Parallel (Inference Time Cost (ms), Fewer Is Better): aa: 151.12, b: 150.69, d: 174.11. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: GPT-2 - Device: CPU - Executor: Standard (Inference Time Cost (ms), Fewer Is Better): aa: 8.14774, b: 8.01188, d: 8.22476. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19, Model: GPT-2 - Device: CPU - Executor: Parallel (Inference Time Cost (ms), Fewer Is Better): aa: 9.39903, b: 9.56576, d: 9.75407. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
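Several ONNX Runtime models appear twice in this export, once as Inferences Per Second and once as Inference Time Cost (ms). The two figures look roughly reciprocal. The Python sketch below checks that relationship for CaffeNet 12-int8 with the Standard executor, assuming throughput is approximately 1000 divided by the latency in milliseconds. The function name and the 1% tolerance are illustrative assumptions, not something the export states.

```python
def throughput_from_latency_ms(latency_ms):
    """Convert a mean per-inference latency in milliseconds to inferences per second."""
    return 1000.0 / latency_ms


# CaffeNet 12-int8, Device: CPU, Executor: Standard (runs aa, b, d),
# values copied from the two result rows for this model.
latency_ms = {"aa": 2.00621, "b": 1.98870, "d": 2.08476}
reported_ips = {"aa": 498.19, "b": 502.55, "d": 479.39}

for run, ms in latency_ms.items():
    derived = throughput_from_latency_ms(ms)
    rel_diff = abs(derived - reported_ips[run]) / reported_ips[run]
    # Flag anything that differs by more than the assumed 1% tolerance.
    status = "ok" if rel_diff < 0.01 else "check"
    print(f"{run}: derived {derived:.2f} inf/s, "
          f"reported {reported_ips[run]:.2f} inf/s ({status})")
```

For these three runs the derived and reported throughputs agree to well within 1%, which is also a quick way to confirm that each value has been attributed to the right run label.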
ParaView 5.13, Test: Many Spheres - Frames: 600 - Resolution: 2560 x 1440 (MiPolys / Sec, More Is Better): a: 17.38.
ParaView 5.13, Test: Many Spheres - Frames: 600 - Resolution: 1920 x 1200 (MiPolys / Sec, More Is Better): a: 17.80.
ParaView 5.13, Test: Many Spheres - Frames: 600 - Resolution: 1920 x 1080 (MiPolys / Sec, More Is Better): a: 17.86.
Phoronix Test Suite v10.8.5