dddd

AMD Ryzen 7 7840HS testing with a Framework Laptop 16 (AMD Ryzen 7040) FRANMZCP07 (03.01 BIOS) and AMD Radeon 780M 512MB on Ubuntu 24.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2411045-NE-DDDD1454570&sro&grr .
System details (configurations: a, aa, b, d)

Processor: AMD Ryzen 7 7840HS @ 5.29GHz (8 Cores / 16 Threads)
Motherboard: Framework Laptop 16 (AMD Ryzen 7040) FRANMZCP07 (03.01 BIOS)
Chipset: AMD Device 14e8
Memory: 2 x 8GB DDR5-5600MT/s A-DATA AD5S56008G-B
Disk: 512GB Western Digital PC SN810 SDCPNRY-512G
Graphics: AMD Radeon 780M 512MB
Audio: AMD Navi 31 HDMI/DP
Network: MEDIATEK MT7922 802.11ax PCI
OS: Ubuntu 24.04
Kernel: 6.8.0-39-generic (x86_64); a later configuration reports 6.8.0-40-generic (x86_64)
Desktop: GNOME Shell 46.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 24.2~git2406200600.0ac0fb~oibaf~n (git-0ac0fbc 2024-06-20 noble-oibaf-ppa) (LLVM 17.0.6 DRM 3.57)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 2560x1600

Kernel Details
  Transparent Huge Pages: madvise

Compiler Details
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
  Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance)
  Platform Profile: balanced
  CPU Microcode: 0xa704103
  ACPI Profile: balanced

Graphics Details
  BAR1 / Visible vRAM Size: 512 MB

Java Details
  OpenJDK Runtime Environment (build 21.0.4+7-Ubuntu-1ubuntu224.04)

Python Details
  Python 3.12.3

Security Details
  gather_data_sampling: Not affected
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  mmio_stale_data: Not affected
  reg_file_data_sampling: Not affected
  retbleed: Not affected
  spec_rstack_overflow: Vulnerable: Safe RET no microcode
  spec_store_bypass: Mitigation of SSB disabled via prctl
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected
  srbds: Not affected
  tsx_async_abort: Not affected
Result overview ("-" means no result for that configuration)

Test | a | aa | b | d
paraview: Many Spheres - 600 - 2560 x 1440 (MiPolys/sec) | 17.38 | - | - | -
paraview: Many Spheres - 600 - 2560 x 1440 (frames/sec) | 0.17 | - | - | -
paraview: Many Spheres - 600 - 1920 x 1200 (MiPolys/sec) | 17.798 | - | - | -
paraview: Many Spheres - 600 - 1920 x 1200 (frames/sec) | 0.18 | - | - | -
paraview: Many Spheres - 600 - 1920 x 1080 (MiPolys/sec) | 17.859 | - | - | -
paraview: Many Spheres - 600 - 1920 x 1080 (frames/sec) | 0.18 | - | - | -
svt-av1: Preset 3 - Beauty 4K 10-bit | - | 0.705 | 0.674 | 0.619
cp2k: H20-256 | - | 1002.985 | 1010.618 | 1052.128
whisperfile: Medium | - | 655.14231 | 657.63606 | 690.37381
epoch: Cone | - | 546.93 | 554.79 | 549.16
svt-av1: Preset 3 - Bosphorus 4K | - | 4.169 | 4.022 | 3.748
svt-av1: Preset 5 - Beauty 4K 10-bit | - | 3.119 | 2.918 | 2.659
byte: Whetstone Double | - | 155612.4 | 137723.9 | 125618.1
svt-av1: Preset 3 - Bosphorus 1080p | - | 14.285 | 13.796 | 12.769
byte: Pipe | - | 16671392.5 | 18027399 | 16434910.2
byte: System Call | - | 16049531.4 | 17367015.3 | 15836969.9
byte: Dhrystone 2 | - | 719195925.3 | 774189636.1 | 709502408.8
svt-av1: Preset 8 - Beauty 4K 10-bit | - | 5.343 | 4.909 | 4.484
whisperfile: Small | - | 229.32516 | 231.05572 | 243.97484
astcenc: Very Thorough | - | 1.2075 | 1.1154 | 1.017
astcenc: Exhaustive | - | 0.7361 | 0.6837 | 0.6236
xnnpack: QS8MobileNetV2 | - | 533 | 526 | 581
xnnpack: FP16MobileNetV3Small | - | 581 | 606 | 654
xnnpack: FP16MobileNetV3Large | - | 1429 | 11295 | 1602
xnnpack: FP16MobileNetV2 | - | 1337 | 10687 | 1533
xnnpack: FP16MobileNetV1 | - | 1868 | 6239 | 2169
xnnpack: FP32MobileNetV3Small | - | 465 | 1998 | 515
xnnpack: FP32MobileNetV3Large | - | 1255 | 9205 | 1319
xnnpack: FP32MobileNetV2 | - | 1057 | 6981 | 1130
xnnpack: FP32MobileNetV1 | - | 1536 | 13914 | 1617
primesieve: 1e13 | - | 188.66 | 191.957 | 206.119
svt-av1: Preset 13 - Beauty 4K 10-bit | - | 7.579 | 6.83 | 6.224
svt-av1: Preset 5 - Bosphorus 4K | - | 15.057 | 14.522 | 13.964
cp2k: Fayalite-FIST | - | 147.309 | 154.827 | 151.938
namd: STMV with 1,066,628 Atoms | - | 0.38378 | 0.37007 | 0.37644
stockfish: Chess Benchmark | - | 21157157 | 20057178 | 20267956
cassandra: Writes | - | 115862 | 108743 | 100972
cp2k: H20-64 | - | 99.801 | 96.287 | 100.098
onednn: Recurrent Neural Network Training - CPU | - | 3155.62 | 3104.92 | 3379.99
svt-av1: Preset 5 - Bosphorus 1080p | - | 49.158 | 47.191 | 44.332
onednn: Recurrent Neural Network Inference - CPU | - | 1770.5 | 1727.88 | 1844.16
onnx: ResNet101_DUC_HDC-12 - CPU - Parallel (ms) | - | 2142.57 | 2474.13 | 2807.93
onnx: ResNet101_DUC_HDC-12 - CPU - Parallel (inferences/sec) | - | 0.466728 | 0.404181 | 0.356134
astcenc: Thorough | - | 8.8155 | 8.1192 | 7.4152
onnx: ResNet101_DUC_HDC-12 - CPU - Standard (ms) | - | 1370 | 2623.99 | 1635.39
onnx: ResNet101_DUC_HDC-12 - CPU - Standard (inferences/sec) | - | 0.729925 | 0.381099 | 0.611475
onnx: fcn-resnet101-11 - CPU - Parallel (ms) | - | 1129.7 | 1309.84 | 1235.3
onnx: fcn-resnet101-11 - CPU - Parallel (inferences/sec) | - | 0.88519 | 0.763448 | 0.809516
onnx: fcn-resnet101-11 - CPU - Standard (ms) | - | 628.78 | 743.501 | 1256.2
onnx: fcn-resnet101-11 - CPU - Standard (inferences/sec) | - | 1.59037 | 1.34498 | 0.79605
onnx: GPT-2 - CPU - Parallel (ms) | - | 9.39903 | 9.56576 | 9.75407
onnx: GPT-2 - CPU - Parallel (inferences/sec) | - | 106.355 | 104.496 | 102.473
onnx: ZFNet-512 - CPU - Parallel (ms) | - | 15.958 | 16.0452 | 17.6235
onnx: ZFNet-512 - CPU - Parallel (inferences/sec) | - | 62.6579 | 62.3166 | 56.7363
onnx: GPT-2 - CPU - Standard (ms) | - | 8.14774 | 8.01188 | 8.22476
onnx: GPT-2 - CPU - Standard (inferences/sec) | - | 122.646 | 124.718 | 121.495
onnx: bertsquad-12 - CPU - Parallel (ms) | - | 109.695 | 117.914 | 122.875
onnx: bertsquad-12 - CPU - Parallel (inferences/sec) | - | 9.11603 | 8.48061 | 8.13821
onnx: yolov4 - CPU - Parallel (ms) | - | 151.119 | 150.687 | 174.106
onnx: yolov4 - CPU - Parallel (inferences/sec) | - | 6.61722 | 6.63616 | 5.74356
onnx: ZFNet-512 - CPU - Standard (ms) | - | 11.5232 | 16.5356 | 17.4118
onnx: ZFNet-512 - CPU - Standard (inferences/sec) | - | 86.771 | 60.4683 | 57.4247
onnx: T5 Encoder - CPU - Parallel (ms) | - | 7.12656 | 7.29084 | 7.28642
onnx: T5 Encoder - CPU - Parallel (inferences/sec) | - | 140.288 | 137.129 | 137.211
onnx: yolov4 - CPU - Standard (ms) | - | 106.772 | 169.449 | 114.561
onnx: yolov4 - CPU - Standard (inferences/sec) | - | 9.36551 | 5.90138 | 8.7288
onnx: bertsquad-12 - CPU - Standard (ms) | - | 63.3729 | 63.7087 | 79.1783
onnx: bertsquad-12 - CPU - Standard (inferences/sec) | - | 15.779 | 15.6956 | 12.6292
onnx: ArcFace ResNet-100 - CPU - Parallel (ms) | - | 47.7186 | 49.3639 | 55.1297
onnx: ArcFace ResNet-100 - CPU - Parallel (inferences/sec) | - | 20.9556 | 20.257 | 18.1385
onnx: T5 Encoder - CPU - Standard (ms) | - | 7.02169 | 6.72951 | 6.78853
onnx: T5 Encoder - CPU - Standard (inferences/sec) | - | 142.366 | 148.556 | 147.262
onnx: ArcFace ResNet-100 - CPU - Standard (ms) | - | 48.5551 | 30.8617 | 33.002
onnx: ArcFace ResNet-100 - CPU - Standard (inferences/sec) | - | 20.5943 | 32.4009 | 30.2995
onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard (ms) | - | 19.0821 | 20.9008 | 30.3849
onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard (inferences/sec) | - | 52.3988 | 47.839 | 32.9074
onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel (ms) | - | 24.8744 | 28.1973 | 30.043
onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel (inferences/sec) | - | 40.1989 | 35.462 | 33.2835
warpx: Plasma Acceleration | - | 55.36615002 | 56.63241771 | 60.22498194
onnx: CaffeNet 12-int8 - CPU - Parallel (ms) | - | 2.19001 | 2.18961 | 2.30027
onnx: CaffeNet 12-int8 - CPU - Parallel (inferences/sec) | - | 456.271 | 456.335 | 434.39
onnx: CaffeNet 12-int8 - CPU - Standard (ms) | - | 2.00621 | 1.9887 | 2.08476
onnx: CaffeNet 12-int8 - CPU - Standard (inferences/sec) | - | 498.19 | 502.552 | 479.385
onnx: ResNet50 v1-12-int8 - CPU - Parallel (ms) | - | 4.31466 | 4.38853 | 4.79323
onnx: ResNet50 v1-12-int8 - CPU - Parallel (inferences/sec) | - | 231.737 | 227.831 | 208.591
onnx: ResNet50 v1-12-int8 - CPU - Standard (ms) | - | 4.10903 | 4.4372 | 4.78049
onnx: ResNet50 v1-12-int8 - CPU - Standard (inferences/sec) | - | 243.293 | 225.296 | 209.121
onnx: super-resolution-10 - CPU - Parallel (ms) | - | 14.0165 | 14.6761 | 16.3246
onnx: super-resolution-10 - CPU - Parallel (inferences/sec) | - | 71.3414 | 68.1345 | 61.2545
onnx: super-resolution-10 - CPU - Standard (ms) | - | 8.7719 | 9.0054 | 9.80512
onnx: super-resolution-10 - CPU - Standard (inferences/sec) | - | 113.981 | 111.026 | 101.972
whisperfile: Tiny | - | 55.18821 | 58.1555 | 59.90084
svt-av1: Preset 8 - Bosphorus 4K | - | 49.912 | 49.519 | 46.575
litert: Inception V4 | - | 33358 | 33308.8 | 36136.2
litert: Inception ResNet V2 | - | 28571.2 | 28254.4 | 30683.2
litert: NASNet Mobile | - | 7642.71 | 7816 | 8630.17
litert: Quantized COCO SSD MobileNet v1 | - | 2016.71 | 22695.2 | 2338.08
litert: Mobilenet Float | - | 1527.48 | 1491.33 | 1622.81
litert: DeepLab V3 | - | 2125.78 | 2128.84 | 2378.54
litert: SqueezeNet | - | 2288.03 | 2286.41 | 2504.03
litert: Mobilenet Quant | - | 1249.46 | 1276.41 | 1458.15
namd: ATPase with 327,506 Atoms | - | 1.32956 | 1.27220 | 1.28715
unvanquished: 2560 x 1600 - Ultra | 233.9 | 234.6 | 234 | 235
unvanquished: 2560 x 1440 - Ultra | 253.4 | 254.7 | 253.9 | 252.4
unvanquished: 2560 x 1600 - High | 266.5 | 266.7 | 266.4 | 266.9
astcenc: Fast | - | 170.7735 | 156.3041 | 142.7008
unvanquished: 2560 x 1440 - High | 288.6 | 289.6 | 289.8 | 288.7
unvanquished: 1920 x 1200 - Ultra | 385.3 | 384.1 | 384.2 | 380.8
unvanquished: 1920 x 1080 - Ultra | 418.2 | 417.3 | 416.3 | 422.2
warpx: Uniform Plasma | - | 21.17645074 | 21.69885843 | 21.70649519
encode-opus: WAV To Opus Encode | - | 22.172 | 26.284 | 28.909
unvanquished: 2560 x 1600 - Medium | 336 | 336.2 | 336.6 | 337
onednn: Deconvolution Batch shapes_1d - CPU | - | 6.10289 | 6.13915 | 6.58308
unvanquished: 1920 x 1200 - High | 437.5 | 439.3 | 438 | 436.4
unvanquished: 2560 x 1440 - Medium | 371.2 | 373.2 | 373.5 | 373.4
unvanquished: 1920 x 1080 - High | 460.1 | 473.3 | 471.6 | 472.3
svt-av1: Preset 8 - Bosphorus 1080p | - | 163.326 | 160.583 | 149.784
svt-av1: Preset 13 - Bosphorus 4K | - | 121.142 | 119.373 | 114.049
astcenc: Medium | - | 67.6726 | 62.1679 | 56.7288
unvanquished: 1920 x 1200 - Medium | 511.8 | 512.5 | 517.5 | 504.8
unvanquished: 1920 x 1080 - Medium | 542.5 | 539.8 | 542.4 | 540.7
primesieve: 1e12 | - | 15.383 | 15.34 | 16.749
onednn: IP Shapes 1D - CPU | - | 2.51714 | 2.54144 | 2.79418
onednn: IP Shapes 3D - CPU | - | 4.57722 | 4.662 | 4.66831
svt-av1: Preset 13 - Bosphorus 1080p | - | 497.584 | 489.204 | 460.217
onednn: Convolution Batch Shapes Auto - CPU | - | 13.3615 | 13.3262 | 13.3305
onednn: Deconvolution Batch shapes_3d - CPU | - | 4.86939 | 4.92932 | 5.72961
ParaView 5.13 - Test: Many Spheres - Frames: 600 - Resolution: 2560 x 1440. MiPolys / Sec, More Is Better. a: 17.38.
ParaView 5.13 - Test: Many Spheres - Frames: 600 - Resolution: 2560 x 1440. Frames / Sec, More Is Better. a: 0.17.
ParaView 5.13 - Test: Many Spheres - Frames: 600 - Resolution: 1920 x 1200. MiPolys / Sec, More Is Better. a: 17.80.
ParaView 5.13 - Test: Many Spheres - Frames: 600 - Resolution: 1920 x 1200. Frames / Sec, More Is Better. a: 0.18.
ParaView 5.13 - Test: Many Spheres - Frames: 600 - Resolution: 1920 x 1080. MiPolys / Sec, More Is Better. a: 17.86.
ParaView 5.13 - Test: Many Spheres - Frames: 600 - Resolution: 1920 x 1080. Frames / Sec, More Is Better. a: 0.18.
SVT-AV1 2.3 - Encoder Mode: Preset 3 - Input: Beauty 4K 10-bit. Frames Per Second, More Is Better. aa: 0.705; b: 0.674; d: 0.619. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
CP2K Molecular Dynamics 2024.3 - Input: H20-256. Seconds, Fewer Is Better. aa: 1002.99; b: 1010.62; d: 1052.13. (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
Whisperfile 20Aug24 - Model Size: Medium. Seconds, Fewer Is Better. aa: 655.14; b: 657.64; d: 690.37.
Epoch 4.19.4 - Epoch3D Deck: Cone. Seconds, Fewer Is Better. aa: 546.93; b: 554.79; d: 549.16. (F9X) gfortran options: -O3 -std=f2003 -Jobj -lsdf -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
SVT-AV1 2.3 - Encoder Mode: Preset 3 - Input: Bosphorus 4K. Frames Per Second, More Is Better. aa: 4.169; b: 4.022; d: 3.748. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 2.3 - Encoder Mode: Preset 5 - Input: Beauty 4K 10-bit. Frames Per Second, More Is Better. aa: 3.119; b: 2.918; d: 2.659. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
BYTE Unix Benchmark 5.1.3-git - Computational Test: Whetstone Double. MWIPS, More Is Better. aa: 155612.4; b: 137723.9; d: 125618.1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
SVT-AV1 2.3 - Encoder Mode: Preset 3 - Input: Bosphorus 1080p. Frames Per Second, More Is Better. aa: 14.29; b: 13.80; d: 12.77. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
BYTE Unix Benchmark 5.1.3-git - Computational Test: Pipe. LPS, More Is Better. aa: 16671392.5; b: 18027399.0; d: 16434910.2. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
BYTE Unix Benchmark 5.1.3-git - Computational Test: System Call. LPS, More Is Better. aa: 16049531.4; b: 17367015.3; d: 15836969.9. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
BYTE Unix Benchmark 5.1.3-git - Computational Test: Dhrystone 2. LPS, More Is Better. aa: 719195925.3; b: 774189636.1; d: 709502408.8. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
SVT-AV1 2.3 - Encoder Mode: Preset 8 - Input: Beauty 4K 10-bit. Frames Per Second, More Is Better. aa: 5.343; b: 4.909; d: 4.484. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Whisperfile 20Aug24 - Model Size: Small. Seconds, Fewer Is Better. aa: 229.33; b: 231.06; d: 243.97.
ASTC Encoder 5.0 - Preset: Very Thorough. MT/s, More Is Better. aa: 1.2075; b: 1.1154; d: 1.0170. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder 5.0 - Preset: Exhaustive. MT/s, More Is Better. aa: 0.7361; b: 0.6837; d: 0.6236. (CXX) g++ options: -O3 -flto -pthread
XNNPACK b7b048 - Model: QS8MobileNetV2. us, Fewer Is Better. aa: 533; b: 526; d: 581. (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048 - Model: FP16MobileNetV3Small. us, Fewer Is Better. aa: 581; b: 606; d: 654. (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048 - Model: FP16MobileNetV3Large. us, Fewer Is Better. aa: 1429; b: 11295; d: 1602. (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048 - Model: FP16MobileNetV2. us, Fewer Is Better. aa: 1337; b: 10687; d: 1533. (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048 - Model: FP16MobileNetV1. us, Fewer Is Better. aa: 1868; b: 6239; d: 2169. (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048 - Model: FP32MobileNetV3Small. us, Fewer Is Better. aa: 465; b: 1998; d: 515. (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048 - Model: FP32MobileNetV3Large. us, Fewer Is Better. aa: 1255; b: 9205; d: 1319. (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048 - Model: FP32MobileNetV2. us, Fewer Is Better. aa: 1057; b: 6981; d: 1130. (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048 - Model: FP32MobileNetV1. us, Fewer Is Better. aa: 1536; b: 13914; d: 1617. (CXX) g++ options: -O3 -lrt -lm
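
In the XNNPACK results above, run b is several times slower than aa and d on most FP16 and FP32 models, while the QS8 and FP16MobileNetV3Small figures stay close together. A minimal Python sketch for quantifying that, using only the values listed above: the dictionary layout and the helper name `slowdown_vs` are illustrative assumptions, not part of the Phoronix Test Suite.

```python
# XNNPACK latencies in microseconds (fewer is better), copied from the results above.
xnnpack_us = {
    "QS8MobileNetV2":       {"aa": 533,  "b": 526,   "d": 581},
    "FP16MobileNetV3Small": {"aa": 581,  "b": 606,   "d": 654},
    "FP16MobileNetV3Large": {"aa": 1429, "b": 11295, "d": 1602},
    "FP16MobileNetV2":      {"aa": 1337, "b": 10687, "d": 1533},
    "FP16MobileNetV1":      {"aa": 1868, "b": 6239,  "d": 2169},
    "FP32MobileNetV3Small": {"aa": 465,  "b": 1998,  "d": 515},
    "FP32MobileNetV3Large": {"aa": 1255, "b": 9205,  "d": 1319},
    "FP32MobileNetV2":      {"aa": 1057, "b": 6981,  "d": 1130},
    "FP32MobileNetV1":      {"aa": 1536, "b": 13914, "d": 1617},
}


def slowdown_vs(baseline, results):
    """Return {model: {run: latency / baseline-run latency}} (hypothetical helper)."""
    return {
        model: {run: us / runs[baseline] for run, us in runs.items()}
        for model, runs in results.items()
    }


ratios = slowdown_vs("aa", xnnpack_us)
for model, runs in ratios.items():
    # A ratio above 1.0 means the run took longer than aa on that model.
    print(f"{model:22s} b: {runs['b']:.2f}x  d: {runs['d']:.2f}x")
```

A ratio far above 1.0 for a single run is worth re-testing before drawing conclusions, since the export gives no cause for the difference.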
Primesieve 12.5 - Length: 1e13. Seconds, Fewer Is Better. aa: 188.66; b: 191.96; d: 206.12. (CXX) g++ options: -O3
SVT-AV1 2.3 - Encoder Mode: Preset 13 - Input: Beauty 4K 10-bit. Frames Per Second, More Is Better. aa: 7.579; b: 6.830; d: 6.224. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 2.3 - Encoder Mode: Preset 5 - Input: Bosphorus 4K. Frames Per Second, More Is Better. aa: 15.06; b: 14.52; d: 13.96. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
CP2K Molecular Dynamics 2024.3 - Input: Fayalite-FIST. Seconds, Fewer Is Better. aa: 147.31; b: 154.83; d: 151.94. (F9X) gfortran options: identical to those listed for CP2K Input: H20-256.
NAMD 3.0 - Input: STMV with 1,066,628 Atoms. ns/day, More Is Better. aa: 0.38378; b: 0.37007; d: 0.37644.
Stockfish 17 - Chess Benchmark. Nodes Per Second, More Is Better. aa: 21157157; b: 20057178; d: 20267956. (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -funroll-loops -msse -msse3 -mpopcnt -mavx2 -mbmi -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto-partition=one -flto=jobserver
Apache Cassandra 5.0 - Test: Writes. Op/s, More Is Better. aa: 115862; b: 108743; d: 100972.
CP2K Molecular Dynamics 2024.3 - Input: H20-64. Seconds, Fewer Is Better. aa: 99.80; b: 96.29; d: 100.10. (F9X) gfortran options: identical to those listed for CP2K Input: H20-256.
oneDNN 3.6 - Harness: Recurrent Neural Network Training - Engine: CPU. ms, Fewer Is Better. aa: 3155.62 (MIN: 3145.17); b: 3104.92 (MIN: 3084.26); d: 3379.99 (MIN: 3369.93). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
SVT-AV1 2.3 - Encoder Mode: Preset 5 - Input: Bosphorus 1080p. Frames Per Second, More Is Better. aa: 49.16; b: 47.19; d: 44.33. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
oneDNN 3.6 - Harness: Recurrent Neural Network Inference - Engine: CPU. ms, Fewer Is Better. aa: 1770.50 (MIN: 1763.22); b: 1727.88 (MIN: 1720.93); d: 1844.16 (MIN: 1834.87). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
ONNX Runtime 1.19 - Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel. Inference Time Cost (ms), Fewer Is Better. aa: 2142.57; b: 2474.13; d: 2807.93. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 - Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel. Inferences Per Second, More Is Better. aa: 0.466728; b: 0.404181; d: 0.356134. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ASTC Encoder 5.0 - Preset: Thorough. MT/s, More Is Better. aa: 8.8155; b: 8.1192; d: 7.4152. (CXX) g++ options: -O3 -flto -pthread
ONNX Runtime 1.19 - Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard. Inference Time Cost (ms), Fewer Is Better. aa: 1370.00; b: 2623.99; d: 1635.39. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 - Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard. Inferences Per Second, More Is Better. aa: 0.729925; b: 0.381099; d: 0.611475. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 - Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel. Inference Time Cost (ms), Fewer Is Better. aa: 1129.70; b: 1309.84; d: 1235.30. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 - Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel. Inferences Per Second, More Is Better. aa: 0.885190; b: 0.763448; d: 0.809516. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 - Model: fcn-resnet101-11 - Device: CPU - Executor: Standard. Inference Time Cost (ms), Fewer Is Better. aa: 628.78; b: 743.50; d: 1256.20. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 - Model: fcn-resnet101-11 - Device: CPU - Executor: Standard. Inferences Per Second, More Is Better. aa: 1.59037; b: 1.34498; d: 0.79605. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 - Model: GPT-2 - Device: CPU - Executor: Parallel. Inference Time Cost (ms), Fewer Is Better. aa: 9.39903; b: 9.56576; d: 9.75407. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 - Model: GPT-2 - Device: CPU - Executor: Parallel. Inferences Per Second, More Is Better. aa: 106.36; b: 104.50; d: 102.47. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 - Model: ZFNet-512 - Device: CPU - Executor: Parallel. Inference Time Cost (ms), Fewer Is Better. aa: 15.96; b: 16.05; d: 17.62. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 - Model: ZFNet-512 - Device: CPU - Executor: Parallel. Inferences Per Second, More Is Better. aa: 62.66; b: 62.32; d: 56.74. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 - Model: GPT-2 - Device: CPU - Executor: Standard. Inference Time Cost (ms), Fewer Is Better. aa: 8.14774; b: 8.01188; d: 8.22476. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 - Model: GPT-2 - Device: CPU - Executor: Standard. Inferences Per Second, More Is Better. aa: 122.65; b: 124.72; d: 121.50. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: bertsquad-12 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: bertsquad-12 - Device: CPU - Executor: Parallel aa b d 30 60 90 120 150 109.70 117.91 122.88 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: bertsquad-12 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: bertsquad-12 - Device: CPU - Executor: Parallel aa b d 3 6 9 12 15 9.11603 8.48061 8.13821 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: yolov4 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: yolov4 - Device: CPU - Executor: Parallel aa b d 40 80 120 160 200 151.12 150.69 174.11 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: yolov4 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: yolov4 - Device: CPU - Executor: Parallel aa b d 2 4 6 8 10 6.61722 6.63616 5.74356 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ZFNet-512 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: ZFNet-512 - Device: CPU - Executor: Standard aa b d 4 8 12 16 20 11.52 16.54 17.41 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ZFNet-512 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: ZFNet-512 - Device: CPU - Executor: Standard aa b d 20 40 60 80 100 86.77 60.47 57.42 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 - Model: T5 Encoder - Device: CPU - Executor: Parallel - Inference Time Cost (ms), Fewer Is Better - aa: 7.12656, b: 7.29084, d: 7.28642
ONNX Runtime 1.19 - Model: T5 Encoder - Device: CPU - Executor: Parallel - Inferences Per Second, More Is Better - aa: 140.29, b: 137.13, d: 137.21
ONNX Runtime 1.19 - Model: yolov4 - Device: CPU - Executor: Standard - Inference Time Cost (ms), Fewer Is Better - aa: 106.77, b: 169.45, d: 114.56
ONNX Runtime 1.19 - Model: yolov4 - Device: CPU - Executor: Standard - Inferences Per Second, More Is Better - aa: 9.36551, b: 5.90138, d: 8.72880
ONNX Runtime 1.19 - Model: bertsquad-12 - Device: CPU - Executor: Standard - Inference Time Cost (ms), Fewer Is Better - aa: 63.37, b: 63.71, d: 79.18
ONNX Runtime 1.19 - Model: bertsquad-12 - Device: CPU - Executor: Standard - Inferences Per Second, More Is Better - aa: 15.78, b: 15.70, d: 12.63
ONNX Runtime 1.19 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel - Inference Time Cost (ms), Fewer Is Better - aa: 47.72, b: 49.36, d: 55.13
ONNX Runtime 1.19 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel - Inferences Per Second, More Is Better - aa: 20.96, b: 20.26, d: 18.14
ONNX Runtime 1.19 - Model: T5 Encoder - Device: CPU - Executor: Standard - Inference Time Cost (ms), Fewer Is Better - aa: 7.02169, b: 6.72951, d: 6.78853
ONNX Runtime 1.19 - Model: T5 Encoder - Device: CPU - Executor: Standard - Inferences Per Second, More Is Better - aa: 142.37, b: 148.56, d: 147.26
ONNX Runtime 1.19 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard - Inference Time Cost (ms), Fewer Is Better - aa: 48.56, b: 30.86, d: 33.00
ONNX Runtime 1.19 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard - Inferences Per Second, More Is Better - aa: 20.59, b: 32.40, d: 30.30
ONNX Runtime 1.19 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard - Inference Time Cost (ms), Fewer Is Better - aa: 19.08, b: 20.90, d: 30.38
ONNX Runtime 1.19 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard - Inferences Per Second, More Is Better - aa: 52.40, b: 47.84, d: 32.91
ONNX Runtime 1.19 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel - Inference Time Cost (ms), Fewer Is Better - aa: 24.87, b: 28.20, d: 30.04
ONNX Runtime 1.19 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel - Inferences Per Second, More Is Better - aa: 40.20, b: 35.46, d: 33.28
ONNX Runtime 1.19 (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
WarpX 24.10 - Input: Plasma Acceleration - Seconds, Fewer Is Better - aa: 55.37, b: 56.63, d: 60.22
WarpX 24.10 (CXX) g++ options: -O3 -lm
ONNX Runtime 1.19 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel - Inference Time Cost (ms), Fewer Is Better - aa: 2.19001, b: 2.18961, d: 2.30027
ONNX Runtime 1.19 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel - Inferences Per Second, More Is Better - aa: 456.27, b: 456.34, d: 434.39
ONNX Runtime 1.19 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard - Inference Time Cost (ms), Fewer Is Better - aa: 2.00621, b: 1.98870, d: 2.08476
ONNX Runtime 1.19 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard - Inferences Per Second, More Is Better - aa: 498.19, b: 502.55, d: 479.39
ONNX Runtime 1.19 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel - Inference Time Cost (ms), Fewer Is Better - aa: 4.31466, b: 4.38853, d: 4.79323
ONNX Runtime 1.19 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel - Inferences Per Second, More Is Better - aa: 231.74, b: 227.83, d: 208.59
ONNX Runtime 1.19 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard - Inference Time Cost (ms), Fewer Is Better - aa: 4.10903, b: 4.43720, d: 4.78049
ONNX Runtime 1.19 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard - Inferences Per Second, More Is Better - aa: 243.29, b: 225.30, d: 209.12
ONNX Runtime 1.19 - Model: super-resolution-10 - Device: CPU - Executor: Parallel - Inference Time Cost (ms), Fewer Is Better - aa: 14.02, b: 14.68, d: 16.32
ONNX Runtime 1.19 - Model: super-resolution-10 - Device: CPU - Executor: Parallel - Inferences Per Second, More Is Better - aa: 71.34, b: 68.13, d: 61.25
ONNX Runtime 1.19 - Model: super-resolution-10 - Device: CPU - Executor: Standard - Inference Time Cost (ms), Fewer Is Better - aa: 8.77190, b: 9.00540, d: 9.80512
ONNX Runtime 1.19 - Model: super-resolution-10 - Device: CPU - Executor: Standard - Inferences Per Second, More Is Better - aa: 113.98, b: 111.03, d: 101.97
ONNX Runtime 1.19 (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
Whisperfile 20Aug24 - Model Size: Tiny - Seconds, Fewer Is Better - aa: 55.19, b: 58.16, d: 59.90
SVT-AV1 2.3 - Encoder Mode: Preset 8 - Input: Bosphorus 4K - Frames Per Second, More Is Better - aa: 49.91, b: 49.52, d: 46.58
SVT-AV1 2.3 (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
LiteRT 2024-10-15 - Model: Inception V4 - Microseconds, Fewer Is Better - aa: 33358.0, b: 33308.8, d: 36136.2
LiteRT 2024-10-15 - Model: Inception ResNet V2 - Microseconds, Fewer Is Better - aa: 28571.2, b: 28254.4, d: 30683.2
LiteRT 2024-10-15 - Model: NASNet Mobile - Microseconds, Fewer Is Better - aa: 7642.71, b: 7816.00, d: 8630.17
LiteRT 2024-10-15 - Model: Quantized COCO SSD MobileNet v1 - Microseconds, Fewer Is Better - aa: 2016.71, b: 22695.20, d: 2338.08
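In the Quantized COCO SSD MobileNet v1 result, run b is roughly an order of magnitude slower than runs aa and d. The export gives no explanation for this. The sketch below shows one way to flag such a run by comparing each value against the median. The function name and the factor of 2 are illustrative assumptions, not part of the Phoronix Test Suite.

```python
from statistics import median

# Values (microseconds) copied from the LiteRT Quantized COCO SSD MobileNet v1 result.
litert_ssd_mobilenet_us = {"aa": 2016.71, "b": 22695.20, "d": 2338.08}


def flag_outliers(results, factor=2.0):
    """Return run names whose value differs from the median by more than `factor` times.

    `factor` is an arbitrary illustrative threshold, not a statistical test.
    """
    mid = median(results.values())
    return [
        run
        for run, value in results.items()
        if value > mid * factor or value < mid / factor
    ]


flagged = flag_outliers(litert_ssd_mobilenet_us)
print(flagged)
```

A flagged run is only a candidate for a re-run or closer inspection. The data here cannot distinguish a real regression from a one-off measurement anomaly.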
LiteRT 2024-10-15 - Model: Mobilenet Float - Microseconds, Fewer Is Better - aa: 1527.48, b: 1491.33, d: 1622.81
LiteRT 2024-10-15 - Model: DeepLab V3 - Microseconds, Fewer Is Better - aa: 2125.78, b: 2128.84, d: 2378.54
LiteRT 2024-10-15 - Model: SqueezeNet - Microseconds, Fewer Is Better - aa: 2288.03, b: 2286.41, d: 2504.03
LiteRT 2024-10-15 - Model: Mobilenet Quant - Microseconds, Fewer Is Better - aa: 1249.46, b: 1276.41, d: 1458.15
NAMD 3.0 - Input: ATPase with 327,506 Atoms - ns/day, More Is Better - aa: 1.32956, b: 1.27220, d: 1.28715
Unvanquished 0.55 - Resolution: 2560 x 1600 - Effects Quality: Ultra - Frames Per Second, More Is Better - a: 233.9, aa: 234.6, b: 234.0, d: 235.0
Unvanquished 0.55 - Resolution: 2560 x 1440 - Effects Quality: Ultra - Frames Per Second, More Is Better - a: 253.4, aa: 254.7, b: 253.9, d: 252.4
Unvanquished 0.55 - Resolution: 2560 x 1600 - Effects Quality: High - Frames Per Second, More Is Better - a: 266.5, aa: 266.7, b: 266.4, d: 266.9
ASTC Encoder 5.0 - Preset: Fast - MT/s, More Is Better - aa: 170.77, b: 156.30, d: 142.70
ASTC Encoder 5.0 (CXX) g++ options: -O3 -flto -pthread
Unvanquished 0.55 - Resolution: 2560 x 1440 - Effects Quality: High - Frames Per Second, More Is Better - a: 288.6, aa: 289.6, b: 289.8, d: 288.7
Unvanquished 0.55 - Resolution: 1920 x 1200 - Effects Quality: Ultra - Frames Per Second, More Is Better - a: 385.3, aa: 384.1, b: 384.2, d: 380.8
Unvanquished 0.55 - Resolution: 1920 x 1080 - Effects Quality: Ultra - Frames Per Second, More Is Better - a: 418.2, aa: 417.3, b: 416.3, d: 422.2
WarpX 24.10 - Input: Uniform Plasma - Seconds, Fewer Is Better - aa: 21.18, b: 21.70, d: 21.71
WarpX 24.10 (CXX) g++ options: -O3 -lm
Opus Codec Encoding 1.5.2 - WAV To Opus Encode - Seconds, Fewer Is Better - aa: 22.17, b: 26.28, d: 28.91
Opus Codec Encoding 1.5.2 (CXX) g++ options: -O3 -fvisibility=hidden -logg -lm
Unvanquished 0.55 - Resolution: 2560 x 1600 - Effects Quality: Medium - Frames Per Second, More Is Better - a: 336.0, aa: 336.2, b: 336.6, d: 337.0
oneDNN 3.6 - Harness: Deconvolution Batch shapes_1d - Engine: CPU - ms, Fewer Is Better - aa: 6.10289 (MIN: 4.53), b: 6.13915 (MIN: 4.76), d: 6.58308 (MIN: 5.22)
oneDNN 3.6 (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
Unvanquished 0.55 - Resolution: 1920 x 1200 - Effects Quality: High - Frames Per Second, More Is Better - a: 437.5, aa: 439.3, b: 438.0, d: 436.4
Unvanquished 0.55 - Resolution: 2560 x 1440 - Effects Quality: Medium - Frames Per Second, More Is Better - a: 371.2, aa: 373.2, b: 373.5, d: 373.4
Unvanquished 0.55 - Resolution: 1920 x 1080 - Effects Quality: High - Frames Per Second, More Is Better - a: 460.1, aa: 473.3, b: 471.6, d: 472.3
SVT-AV1 2.3 - Encoder Mode: Preset 8 - Input: Bosphorus 1080p - Frames Per Second, More Is Better - aa: 163.33, b: 160.58, d: 149.78
SVT-AV1 2.3 - Encoder Mode: Preset 13 - Input: Bosphorus 4K - Frames Per Second, More Is Better - aa: 121.14, b: 119.37, d: 114.05
SVT-AV1 2.3 (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
ASTC Encoder 5.0 - Preset: Medium - MT/s, More Is Better - aa: 67.67, b: 62.17, d: 56.73
ASTC Encoder 5.0 (CXX) g++ options: -O3 -flto -pthread
Unvanquished 0.55 - Resolution: 1920 x 1200 - Effects Quality: Medium - Frames Per Second, More Is Better - a: 511.8, aa: 512.5, b: 517.5, d: 504.8
Unvanquished 0.55 - Resolution: 1920 x 1080 - Effects Quality: Medium - Frames Per Second, More Is Better - a: 542.5, aa: 539.8, b: 542.4, d: 540.7
Primesieve 12.5 - Length: 1e12 - Seconds, Fewer Is Better - aa: 15.38, b: 15.34, d: 16.75
Primesieve 12.5 (CXX) g++ options: -O3
oneDNN 3.6 - Harness: IP Shapes 1D - Engine: CPU - ms, Fewer Is Better - aa: 2.51714 (MIN: 2.4), b: 2.54144 (MIN: 2.43), d: 2.79418 (MIN: 2.66)
oneDNN 3.6 - Harness: IP Shapes 3D - Engine: CPU - ms, Fewer Is Better - aa: 4.57722 (MIN: 4.47), b: 4.66200 (MIN: 4.52), d: 4.66831 (MIN: 4.52)
oneDNN 3.6 (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
SVT-AV1 2.3 - Encoder Mode: Preset 13 - Input: Bosphorus 1080p - Frames Per Second, More Is Better - aa: 497.58, b: 489.20, d: 460.22
SVT-AV1 2.3 (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
oneDNN 3.6 - Harness: Convolution Batch Shapes Auto - Engine: CPU - ms, Fewer Is Better - aa: 13.36 (MIN: 13.14), b: 13.33 (MIN: 13.14), d: 13.33 (MIN: 13.11)
oneDNN 3.6 - Harness: Deconvolution Batch shapes_3d - Engine: CPU - ms, Fewer Is Better - aa: 4.86939 (MIN: 4.64), b: 4.92932 (MIN: 4.92), d: 5.72961 (MIN: 5.66)
oneDNN 3.6 (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
Phoronix Test Suite v10.8.5