dddd AMD Ryzen 7 7840HS testing with a Framework Laptop 16 (AMD Ryzen 7040 ) FRANMZCP07 (03.01 BIOS) and AMD Radeon 780M 512MB on Ubuntu 24.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2411045-NE-DDDD1454570&sor&grt .
dddd Processor Motherboard Chipset Memory Disk Graphics Audio Network OS Kernel Desktop Display Server OpenGL Compiler File-System Screen Resolution a aa b d AMD Ryzen 7 7840HS @ 5.29GHz (8 Cores / 16 Threads) Framework Laptop 16 (AMD Ryzen 7040 ) FRANMZCP07 (03.01 BIOS) AMD Device 14e8 2 x 8GB DDR5-5600MT/s A-DATA AD5S56008G-B 512GB Western Digital PC SN810 SDCPNRY-512G AMD Radeon 780M 512MB AMD Navi 31 HDMI/DP MEDIATEK MT7922 802.11ax PCI Ubuntu 24.04 6.8.0-39-generic (x86_64) GNOME Shell 46.0 X Server + Wayland 4.6 Mesa 24.2~git2406200600.0ac0fb~oibaf~n (git-0ac0fbc 2024-06-20 noble-oibaf-ppa) (LLVM 17.0.6 DRM 3.57) GCC 13.2.0 ext4 2560x1600 6.8.0-40-generic (x86_64) OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - Platform Profile: balanced - CPU Microcode: 0xa704103 - ACPI Profile: balanced Graphics Details - BAR1 / Visible vRAM Size: 512 MB Java Details - OpenJDK Runtime Environment (build 21.0.4+7-Ubuntu-1ubuntu224.04) Python Details - Python 3.12.3 Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Vulnerable: Safe RET no microcode + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
dddd cassandra: Writes astcenc: Fast astcenc: Medium astcenc: Thorough astcenc: Exhaustive astcenc: Very Thorough byte: Pipe byte: Dhrystone 2 byte: System Call byte: Whetstone Double cp2k: H20-64 cp2k: H20-256 cp2k: Fayalite-FIST epoch: Cone litert: DeepLab V3 litert: SqueezeNet litert: Inception V4 litert: NASNet Mobile litert: Mobilenet Float litert: Mobilenet Quant litert: Inception ResNet V2 litert: Quantized COCO SSD MobileNet v1 namd: ATPase with 327,506 Atoms namd: STMV with 1,066,628 Atoms onednn: IP Shapes 1D - CPU onednn: IP Shapes 3D - CPU onednn: Convolution Batch Shapes Auto - CPU onednn: Deconvolution Batch shapes_1d - CPU onednn: Deconvolution Batch shapes_3d - CPU onednn: Recurrent Neural Network Training - CPU onednn: Recurrent Neural Network Inference - CPU onnx: GPT-2 - CPU - Parallel onnx: GPT-2 - CPU - Parallel onnx: GPT-2 - CPU - Standard onnx: GPT-2 - CPU - Standard onnx: yolov4 - CPU - Parallel onnx: yolov4 - CPU - Parallel onnx: yolov4 - CPU - Standard onnx: yolov4 - CPU - Standard onnx: ZFNet-512 - CPU - Parallel onnx: ZFNet-512 - CPU - Parallel onnx: ZFNet-512 - CPU - Standard onnx: ZFNet-512 - CPU - Standard onnx: T5 Encoder - CPU - Parallel onnx: T5 Encoder - CPU - Parallel onnx: T5 Encoder - CPU - Standard onnx: T5 Encoder - CPU - Standard onnx: bertsquad-12 - CPU - Parallel onnx: bertsquad-12 - CPU - Parallel onnx: bertsquad-12 - CPU - Standard onnx: bertsquad-12 - CPU - Standard onnx: CaffeNet 12-int8 - CPU - Parallel onnx: CaffeNet 12-int8 - CPU - Parallel onnx: CaffeNet 12-int8 - CPU - Standard onnx: CaffeNet 12-int8 - CPU - Standard onnx: fcn-resnet101-11 - CPU - Parallel onnx: fcn-resnet101-11 - CPU - Parallel onnx: fcn-resnet101-11 - CPU - Standard onnx: fcn-resnet101-11 - CPU - Standard onnx: ArcFace ResNet-100 - CPU - Parallel onnx: ArcFace ResNet-100 - CPU - Parallel onnx: ArcFace ResNet-100 - CPU - Standard onnx: ArcFace ResNet-100 - CPU - Standard onnx: ResNet50 v1-12-int8 - CPU - Parallel onnx: ResNet50 v1-12-int8 - CPU - Parallel onnx: ResNet50 v1-12-int8 - CPU - Standard onnx: ResNet50 v1-12-int8 - CPU - Standard onnx: super-resolution-10 - CPU - Parallel onnx: super-resolution-10 - CPU - Parallel onnx: super-resolution-10 - CPU - Standard onnx: super-resolution-10 - CPU - Standard onnx: ResNet101_DUC_HDC-12 - CPU - Parallel onnx: ResNet101_DUC_HDC-12 - CPU - Parallel onnx: ResNet101_DUC_HDC-12 - CPU - Standard onnx: ResNet101_DUC_HDC-12 - CPU - Standard onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard encode-opus: WAV To Opus Encode paraview: Many Spheres - 600 - 1920 x 1080 paraview: Many Spheres - 600 - 1920 x 1080 paraview: Many Spheres - 600 - 1920 x 1200 paraview: Many Spheres - 600 - 1920 x 1200 paraview: Many Spheres - 600 - 2560 x 1440 paraview: Many Spheres - 600 - 2560 x 1440 primesieve: 1e12 primesieve: 1e13 stockfish: Chess Benchmark svt-av1: Preset 3 - Bosphorus 4K svt-av1: Preset 5 - Bosphorus 4K svt-av1: Preset 8 - Bosphorus 4K svt-av1: Preset 13 - Bosphorus 4K svt-av1: Preset 3 - Bosphorus 1080p svt-av1: Preset 5 - Bosphorus 1080p svt-av1: Preset 8 - Bosphorus 1080p svt-av1: Preset 13 - Bosphorus 1080p svt-av1: Preset 3 - Beauty 4K 10-bit svt-av1: Preset 5 - Beauty 4K 10-bit svt-av1: Preset 8 - Beauty 4K 10-bit svt-av1: Preset 13 - Beauty 4K 10-bit unvanquished: 1920 x 1080 - High unvanquished: 1920 x 1200 - High unvanquished: 2560 x 1440 - High unvanquished: 2560 x 1600 - High unvanquished: 1920 x 1080 - Ultra unvanquished: 1920 x 1200 - Ultra unvanquished: 2560 x 1440 - Ultra unvanquished: 2560 x 1600 - Ultra unvanquished: 1920 x 1080 - Medium unvanquished: 1920 x 1200 - Medium unvanquished: 2560 x 1440 - Medium unvanquished: 2560 x 1600 - Medium warpx: Uniform Plasma warpx: Plasma Acceleration whisperfile: Tiny whisperfile: Small whisperfile: Medium xnnpack: FP32MobileNetV1 xnnpack: FP32MobileNetV2 xnnpack: FP32MobileNetV3Large xnnpack: FP32MobileNetV3Small xnnpack: FP16MobileNetV1 xnnpack: FP16MobileNetV2 xnnpack: FP16MobileNetV3Large xnnpack: FP16MobileNetV3Small xnnpack: QS8MobileNetV2 a aa b d 0.18 17.859 0.18 17.798 0.17 17.38 460.1 437.5 288.6 266.5 418.2 385.3 253.4 233.9 542.5 511.8 371.2 336 115862 170.7735 67.6726 8.8155 0.7361 1.2075 16671392.5 719195925.3 16049531.4 155612.4 99.801 1002.985 147.309 546.93 2125.78 2288.03 33358 7642.71 1527.48 1249.46 28571.2 2016.71 1.32956 0.38378 2.51714 4.57722 13.3615 6.10289 4.86939 3155.62 1770.5 106.355 9.39903 122.646 8.14774 6.61722 151.119 9.36551 106.772 62.6579 15.958 86.771 11.5232 140.288 7.12656 142.366 7.02169 9.11603 109.695 15.779 63.3729 456.271 2.19001 498.19 2.00621 0.88519 1129.7 1.59037 628.78 20.9556 47.7186 20.5943 48.5551 231.737 4.31466 243.293 4.10903 71.3414 14.0165 113.981 8.7719 0.466728 2142.57 0.729925 1370 40.1989 24.8744 52.3988 19.0821 22.172 15.383 188.66 21157157 4.169 15.057 49.912 121.142 14.285 49.158 163.326 497.584 0.705 3.119 5.343 7.579 473.3 439.3 289.6 266.7 417.3 384.1 254.7 234.6 539.8 512.5 373.2 336.2 21.17645074 55.36615002 55.18821 229.32516 655.14231 1536 1057 1255 465 1868 1337 1429 581 533 108743 156.3041 62.1679 8.1192 0.6837 1.1154 18027399 774189636.1 17367015.3 137723.9 96.287 1010.618 154.827 554.79 2128.84 2286.41 33308.8 7816 1491.33 1276.41 28254.4 22695.2 1.27220 0.37007 2.54144 4.662 13.3262 6.13915 4.92932 3104.92 1727.88 104.496 9.56576 124.718 8.01188 6.63616 150.687 5.90138 169.449 62.3166 16.0452 60.4683 16.5356 137.129 7.29084 148.556 6.72951 8.48061 117.914 15.6956 63.7087 456.335 2.18961 502.552 1.9887 0.763448 1309.84 1.34498 743.501 20.257 49.3639 32.4009 30.8617 227.831 4.38853 225.296 4.4372 68.1345 14.6761 111.026 9.0054 0.404181 2474.13 0.381099 2623.99 35.462 28.1973 47.839 20.9008 26.284 15.34 191.957 20057178 4.022 14.522 49.519 119.373 13.796 47.191 160.583 489.204 0.674 2.918 4.909 6.83 471.6 438 289.8 266.4 416.3 384.2 253.9 234 542.4 517.5 373.5 336.6 21.69885843 56.63241771 58.1555 231.05572 657.63606 13914 6981 9205 1998 6239 10687 11295 606 526 100972 142.7008 56.7288 7.4152 0.6236 1.017 16434910.2 709502408.8 15836969.9 125618.1 100.098 1052.128 151.938 549.16 2378.54 2504.03 36136.2 8630.17 1622.81 1458.15 30683.2 2338.08 1.28715 0.37644 2.79418 4.66831 13.3305 6.58308 5.72961 3379.99 1844.16 102.473 9.75407 121.495 8.22476 5.74356 174.106 8.7288 114.561 56.7363 17.6235 57.4247 17.4118 137.211 7.28642 147.262 6.78853 8.13821 122.875 12.6292 79.1783 434.39 2.30027 479.385 2.08476 0.809516 1235.3 0.79605 1256.2 18.1385 55.1297 30.2995 33.002 208.591 4.79323 209.121 4.78049 61.2545 16.3246 101.972 9.80512 0.356134 2807.93 0.611475 1635.39 33.2835 30.043 32.9074 30.3849 28.909 16.749 206.119 20267956 3.748 13.964 46.575 114.049 12.769 44.332 149.784 460.217 0.619 2.659 4.484 6.224 472.3 436.4 288.7 266.9 422.2 380.8 252.4 235 540.7 504.8 373.4 337 21.70649519 60.22498194 59.90084 243.97484 690.37381 1617 1130 1319 515 2169 1533 1602 654 581 OpenBenchmarking.org
Apache Cassandra Test: Writes OpenBenchmarking.org Op/s, More Is Better Apache Cassandra 5.0 Test: Writes aa b d 20K 40K 60K 80K 100K 115862 108743 100972
ASTC Encoder Preset: Fast OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 5.0 Preset: Fast aa b d 40 80 120 160 200 170.77 156.30 142.70 1. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder Preset: Medium OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 5.0 Preset: Medium aa b d 15 30 45 60 75 67.67 62.17 56.73 1. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder Preset: Thorough OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 5.0 Preset: Thorough aa b d 2 4 6 8 10 8.8155 8.1192 7.4152 1. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder Preset: Exhaustive OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 5.0 Preset: Exhaustive aa b d 0.1656 0.3312 0.4968 0.6624 0.828 0.7361 0.6837 0.6236 1. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder Preset: Very Thorough OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 5.0 Preset: Very Thorough aa b d 0.2717 0.5434 0.8151 1.0868 1.3585 1.2075 1.1154 1.0170 1. (CXX) g++ options: -O3 -flto -pthread
BYTE Unix Benchmark Computational Test: Pipe OpenBenchmarking.org LPS, More Is Better BYTE Unix Benchmark 5.1.3-git Computational Test: Pipe b aa d 4M 8M 12M 16M 20M 18027399.0 16671392.5 16434910.2 1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
BYTE Unix Benchmark Computational Test: Dhrystone 2 OpenBenchmarking.org LPS, More Is Better BYTE Unix Benchmark 5.1.3-git Computational Test: Dhrystone 2 b aa d 170M 340M 510M 680M 850M 774189636.1 719195925.3 709502408.8 1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
BYTE Unix Benchmark Computational Test: System Call OpenBenchmarking.org LPS, More Is Better BYTE Unix Benchmark 5.1.3-git Computational Test: System Call b aa d 4M 8M 12M 16M 20M 17367015.3 16049531.4 15836969.9 1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
BYTE Unix Benchmark Computational Test: Whetstone Double OpenBenchmarking.org MWIPS, More Is Better BYTE Unix Benchmark 5.1.3-git Computational Test: Whetstone Double aa b d 30K 60K 90K 120K 150K 155612.4 137723.9 125618.1 1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
CP2K Molecular Dynamics Input: H20-64 OpenBenchmarking.org Seconds, Fewer Is Better CP2K Molecular Dynamics 2024.3 Input: H20-64 b aa d 20 40 60 80 100 96.29 99.80 100.10 1. (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
CP2K Molecular Dynamics Input: H20-256 OpenBenchmarking.org Seconds, Fewer Is Better CP2K Molecular Dynamics 2024.3 Input: H20-256 aa b d 200 400 600 800 1000 1002.99 1010.62 1052.13 1. (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
CP2K Molecular Dynamics Input: Fayalite-FIST OpenBenchmarking.org Seconds, Fewer Is Better CP2K Molecular Dynamics 2024.3 Input: Fayalite-FIST aa d b 30 60 90 120 150 147.31 151.94 154.83 1. (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
Epoch Epoch3D Deck: Cone OpenBenchmarking.org Seconds, Fewer Is Better Epoch 4.19.4 Epoch3D Deck: Cone aa d b 120 240 360 480 600 546.93 549.16 554.79 1. (F9X) gfortran options: -O3 -std=f2003 -Jobj -lsdf -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
LiteRT Model: DeepLab V3 OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: DeepLab V3 aa b d 500 1000 1500 2000 2500 2125.78 2128.84 2378.54
LiteRT Model: SqueezeNet OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: SqueezeNet b aa d 500 1000 1500 2000 2500 2286.41 2288.03 2504.03
LiteRT Model: Inception V4 OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Inception V4 b aa d 8K 16K 24K 32K 40K 33308.8 33358.0 36136.2
LiteRT Model: NASNet Mobile OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: NASNet Mobile aa b d 2K 4K 6K 8K 10K 7642.71 7816.00 8630.17
LiteRT Model: Mobilenet Float OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Mobilenet Float b aa d 300 600 900 1200 1500 1491.33 1527.48 1622.81
LiteRT Model: Mobilenet Quant OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Mobilenet Quant aa b d 300 600 900 1200 1500 1249.46 1276.41 1458.15
LiteRT Model: Inception ResNet V2 OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Inception ResNet V2 b aa d 7K 14K 21K 28K 35K 28254.4 28571.2 30683.2
LiteRT Model: Quantized COCO SSD MobileNet v1 OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Quantized COCO SSD MobileNet v1 aa d b 5K 10K 15K 20K 25K 2016.71 2338.08 22695.20
NAMD Input: ATPase with 327,506 Atoms OpenBenchmarking.org ns/day, More Is Better NAMD 3.0 Input: ATPase with 327,506 Atoms aa d b 0.2992 0.5984 0.8976 1.1968 1.496 1.32956 1.28715 1.27220
NAMD Input: STMV with 1,066,628 Atoms OpenBenchmarking.org ns/day, More Is Better NAMD 3.0 Input: STMV with 1,066,628 Atoms aa d b 0.0864 0.1728 0.2592 0.3456 0.432 0.38378 0.37644 0.37007
oneDNN Harness: IP Shapes 1D - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: IP Shapes 1D - Engine: CPU aa b d 0.6287 1.2574 1.8861 2.5148 3.1435 2.51714 2.54144 2.79418 MIN: 2.4 MIN: 2.43 MIN: 2.66 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
oneDNN Harness: IP Shapes 3D - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: IP Shapes 3D - Engine: CPU aa b d 1.0504 2.1008 3.1512 4.2016 5.252 4.57722 4.66200 4.66831 MIN: 4.47 MIN: 4.52 MIN: 4.52 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
oneDNN Harness: Convolution Batch Shapes Auto - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: Convolution Batch Shapes Auto - Engine: CPU b d aa 3 6 9 12 15 13.33 13.33 13.36 MIN: 13.14 MIN: 13.11 MIN: 13.14 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
oneDNN Harness: Deconvolution Batch shapes_1d - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: Deconvolution Batch shapes_1d - Engine: CPU aa b d 2 4 6 8 10 6.10289 6.13915 6.58308 MIN: 4.53 MIN: 4.76 MIN: 5.22 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
oneDNN Harness: Deconvolution Batch shapes_3d - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: Deconvolution Batch shapes_3d - Engine: CPU aa b d 1.2892 2.5784 3.8676 5.1568 6.446 4.86939 4.92932 5.72961 MIN: 4.64 MIN: 4.92 MIN: 5.66 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
oneDNN Harness: Recurrent Neural Network Training - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: Recurrent Neural Network Training - Engine: CPU b aa d 700 1400 2100 2800 3500 3104.92 3155.62 3379.99 MIN: 3084.26 MIN: 3145.17 MIN: 3369.93 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
oneDNN Harness: Recurrent Neural Network Inference - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: Recurrent Neural Network Inference - Engine: CPU b aa d 400 800 1200 1600 2000 1727.88 1770.50 1844.16 MIN: 1720.93 MIN: 1763.22 MIN: 1834.87 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
ONNX Runtime Model: GPT-2 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: GPT-2 - Device: CPU - Executor: Parallel aa b d 20 40 60 80 100 106.36 104.50 102.47 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: GPT-2 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: GPT-2 - Device: CPU - Executor: Parallel aa b d 3 6 9 12 15 9.39903 9.56576 9.75407 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: GPT-2 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: GPT-2 - Device: CPU - Executor: Standard b aa d 30 60 90 120 150 124.72 122.65 121.50 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: GPT-2 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: GPT-2 - Device: CPU - Executor: Standard b aa d 2 4 6 8 10 8.01188 8.14774 8.22476 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: yolov4 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: yolov4 - Device: CPU - Executor: Parallel b aa d 2 4 6 8 10 6.63616 6.61722 5.74356 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: yolov4 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: yolov4 - Device: CPU - Executor: Parallel b aa d 40 80 120 160 200 150.69 151.12 174.11 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: yolov4 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: yolov4 - Device: CPU - Executor: Standard aa d b 3 6 9 12 15 9.36551 8.72880 5.90138 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: yolov4 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: yolov4 - Device: CPU - Executor: Standard aa d b 40 80 120 160 200 106.77 114.56 169.45 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ZFNet-512 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: ZFNet-512 - Device: CPU - Executor: Parallel aa b d 14 28 42 56 70 62.66 62.32 56.74 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ZFNet-512 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: ZFNet-512 - Device: CPU - Executor: Parallel aa b d 4 8 12 16 20 15.96 16.05 17.62 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ZFNet-512 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: ZFNet-512 - Device: CPU - Executor: Standard aa b d 20 40 60 80 100 86.77 60.47 57.42 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ZFNet-512 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: ZFNet-512 - Device: CPU - Executor: Standard aa b d 4 8 12 16 20 11.52 16.54 17.41 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: T5 Encoder - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: T5 Encoder - Device: CPU - Executor: Parallel aa d b 30 60 90 120 150 140.29 137.21 137.13 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: T5 Encoder - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: T5 Encoder - Device: CPU - Executor: Parallel aa d b 2 4 6 8 10 7.12656 7.28642 7.29084 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: T5 Encoder - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: T5 Encoder - Device: CPU - Executor: Standard b d aa 30 60 90 120 150 148.56 147.26 142.37 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: T5 Encoder - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: T5 Encoder - Device: CPU - Executor: Standard b d aa 2 4 6 8 10 6.72951 6.78853 7.02169 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: bertsquad-12 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: bertsquad-12 - Device: CPU - Executor: Parallel aa b d 3 6 9 12 15 9.11603 8.48061 8.13821 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: bertsquad-12 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: bertsquad-12 - Device: CPU - Executor: Parallel aa b d 30 60 90 120 150 109.70 117.91 122.88 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: bertsquad-12 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: bertsquad-12 - Device: CPU - Executor: Standard aa b d 4 8 12 16 20 15.78 15.70 12.63 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: bertsquad-12 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: bertsquad-12 - Device: CPU - Executor: Standard aa b d 20 40 60 80 100 63.37 63.71 79.18 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel b aa d 100 200 300 400 500 456.34 456.27 434.39 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel b aa d 0.5176 1.0352 1.5528 2.0704 2.588 2.18961 2.19001 2.30027 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard b aa d 110 220 330 440 550 502.55 498.19 479.39 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard b aa d 0.4691 0.9382 1.4073 1.8764 2.3455 1.98870 2.00621 2.08476 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel aa d b 0.1992 0.3984 0.5976 0.7968 0.996 0.885190 0.809516 0.763448 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel aa d b 300 600 900 1200 1500 1129.70 1235.30 1309.84 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: fcn-resnet101-11 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: fcn-resnet101-11 - Device: CPU - Executor: Standard aa b d 0.3578 0.7156 1.0734 1.4312 1.789 1.59037 1.34498 0.79605 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: fcn-resnet101-11 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: fcn-resnet101-11 - Device: CPU - Executor: Standard aa b d 300 600 900 1200 1500 628.78 743.50 1256.20 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel aa b d 5 10 15 20 25 20.96 20.26 18.14 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel aa b d 12 24 36 48 60 47.72 49.36 55.13 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard b d aa 8 16 24 32 40 32.40 30.30 20.59 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard b d aa 11 22 33 44 55 30.86 33.00 48.56 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel aa b d 50 100 150 200 250 231.74 227.83 208.59 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel aa b d 1.0785 2.157 3.2355 4.314 5.3925 4.31466 4.38853 4.79323 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard aa b d 50 100 150 200 250 243.29 225.30 209.12 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard aa b d 1.0756 2.1512 3.2268 4.3024 5.378 4.10903 4.43720 4.78049 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: super-resolution-10 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: super-resolution-10 - Device: CPU - Executor: Parallel aa b d 16 32 48 64 80 71.34 68.13 61.25 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: super-resolution-10 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: super-resolution-10 - Device: CPU - Executor: Parallel aa b d 4 8 12 16 20 14.02 14.68 16.32 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: super-resolution-10 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: super-resolution-10 - Device: CPU - Executor: Standard aa b d 30 60 90 120 150 113.98 111.03 101.97 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: super-resolution-10 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: super-resolution-10 - Device: CPU - Executor: Standard aa b d 3 6 9 12 15 8.77190 9.00540 9.80512 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel aa b d 0.105 0.21 0.315 0.42 0.525 0.466728 0.404181 0.356134 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel aa b d 600 1200 1800 2400 3000 2142.57 2474.13 2807.93 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard aa d b 0.1642 0.3284 0.4926 0.6568 0.821 0.729925 0.611475 0.381099 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard aa d b 600 1200 1800 2400 3000 1370.00 1635.39 2623.99 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel aa b d 9 18 27 36 45 40.20 35.46 33.28 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel aa b d 7 14 21 28 35 24.87 28.20 30.04 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard aa b d 12 24 36 48 60 52.40 47.84 32.91 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard aa b d 7 14 21 28 35 19.08 20.90 30.38 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
Opus Codec Encoding WAV To Opus Encode OpenBenchmarking.org Seconds, Fewer Is Better Opus Codec Encoding 1.5.2 WAV To Opus Encode aa b d 7 14 21 28 35 22.17 26.28 28.91 1. (CXX) g++ options: -O3 -fvisibility=hidden -logg -lm
ParaView Test: Many Spheres - Frames: 600 - Resolution: 1920 x 1080 OpenBenchmarking.org Frames / Sec, More Is Better ParaView 5.13 Test: Many Spheres - Frames: 600 - Resolution: 1920 x 1080 a 0.0405 0.081 0.1215 0.162 0.2025 0.18
ParaView Test: Many Spheres - Frames: 600 - Resolution: 1920 x 1080 OpenBenchmarking.org MiPolys / Sec, More Is Better ParaView 5.13 Test: Many Spheres - Frames: 600 - Resolution: 1920 x 1080 a 4 8 12 16 20 17.86
ParaView Test: Many Spheres - Frames: 600 - Resolution: 1920 x 1200 OpenBenchmarking.org Frames / Sec, More Is Better ParaView 5.13 Test: Many Spheres - Frames: 600 - Resolution: 1920 x 1200 a 0.0405 0.081 0.1215 0.162 0.2025 0.18
ParaView Test: Many Spheres - Frames: 600 - Resolution: 1920 x 1200 OpenBenchmarking.org MiPolys / Sec, More Is Better ParaView 5.13 Test: Many Spheres - Frames: 600 - Resolution: 1920 x 1200 a 4 8 12 16 20 17.80
ParaView Test: Many Spheres - Frames: 600 - Resolution: 2560 x 1440 OpenBenchmarking.org Frames / Sec, More Is Better ParaView 5.13 Test: Many Spheres - Frames: 600 - Resolution: 2560 x 1440 a 0.0383 0.0766 0.1149 0.1532 0.1915 0.17
ParaView Test: Many Spheres - Frames: 600 - Resolution: 2560 x 1440 OpenBenchmarking.org MiPolys / Sec, More Is Better ParaView 5.13 Test: Many Spheres - Frames: 600 - Resolution: 2560 x 1440 a 4 8 12 16 20 17.38
Primesieve Length: 1e12 OpenBenchmarking.org Seconds, Fewer Is Better Primesieve 12.5 Length: 1e12 b aa d 4 8 12 16 20 15.34 15.38 16.75 1. (CXX) g++ options: -O3
Primesieve Length: 1e13 OpenBenchmarking.org Seconds, Fewer Is Better Primesieve 12.5 Length: 1e13 aa b d 50 100 150 200 250 188.66 191.96 206.12 1. (CXX) g++ options: -O3
Stockfish Chess Benchmark OpenBenchmarking.org Nodes Per Second, More Is Better Stockfish 17 Chess Benchmark aa d b 5M 10M 15M 20M 25M 21157157 20267956 20057178 1. (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -funroll-loops -msse -msse3 -mpopcnt -mavx2 -mbmi -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto-partition=one -flto=jobserver
SVT-AV1 Encoder Mode: Preset 3 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 3 - Input: Bosphorus 4K aa b d 0.938 1.876 2.814 3.752 4.69 4.169 4.022 3.748 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 Encoder Mode: Preset 5 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 5 - Input: Bosphorus 4K aa b d 4 8 12 16 20 15.06 14.52 13.96 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 Encoder Mode: Preset 8 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 8 - Input: Bosphorus 4K aa b d 11 22 33 44 55 49.91 49.52 46.58 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 Encoder Mode: Preset 13 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 13 - Input: Bosphorus 4K aa b d 30 60 90 120 150 121.14 119.37 114.05 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 Encoder Mode: Preset 3 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 3 - Input: Bosphorus 1080p aa b d 4 8 12 16 20 14.29 13.80 12.77 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 Encoder Mode: Preset 5 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 5 - Input: Bosphorus 1080p aa b d 11 22 33 44 55 49.16 47.19 44.33 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 Encoder Mode: Preset 8 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 8 - Input: Bosphorus 1080p aa b d 40 80 120 160 200 163.33 160.58 149.78 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 Encoder Mode: Preset 13 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 13 - Input: Bosphorus 1080p aa b d 110 220 330 440 550 497.58 489.20 460.22 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 Encoder Mode: Preset 3 - Input: Beauty 4K 10-bit OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 3 - Input: Beauty 4K 10-bit aa b d 0.1586 0.3172 0.4758 0.6344 0.793 0.705 0.674 0.619 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 Encoder Mode: Preset 5 - Input: Beauty 4K 10-bit OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 5 - Input: Beauty 4K 10-bit aa b d 0.7018 1.4036 2.1054 2.8072 3.509 3.119 2.918 2.659 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 Encoder Mode: Preset 8 - Input: Beauty 4K 10-bit OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 8 - Input: Beauty 4K 10-bit aa b d 1.2022 2.4044 3.6066 4.8088 6.011 5.343 4.909 4.484 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 Encoder Mode: Preset 13 - Input: Beauty 4K 10-bit OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 13 - Input: Beauty 4K 10-bit aa b d 2 4 6 8 10 7.579 6.830 6.224 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Unvanquished Resolution: 1920 x 1080 - Effects Quality: High OpenBenchmarking.org Frames Per Second, More Is Better Unvanquished 0.55 Resolution: 1920 x 1080 - Effects Quality: High aa d b a 100 200 300 400 500 473.3 472.3 471.6 460.1
Unvanquished Resolution: 1920 x 1200 - Effects Quality: High OpenBenchmarking.org Frames Per Second, More Is Better Unvanquished 0.55 Resolution: 1920 x 1200 - Effects Quality: High aa b a d 100 200 300 400 500 439.3 438.0 437.5 436.4
Unvanquished Resolution: 2560 x 1440 - Effects Quality: High OpenBenchmarking.org Frames Per Second, More Is Better Unvanquished 0.55 Resolution: 2560 x 1440 - Effects Quality: High b aa d a 60 120 180 240 300 289.8 289.6 288.7 288.6
Unvanquished Resolution: 2560 x 1600 - Effects Quality: High OpenBenchmarking.org Frames Per Second, More Is Better Unvanquished 0.55 Resolution: 2560 x 1600 - Effects Quality: High d aa a b 60 120 180 240 300 266.9 266.7 266.5 266.4
Unvanquished Resolution: 1920 x 1080 - Effects Quality: Ultra OpenBenchmarking.org Frames Per Second, More Is Better Unvanquished 0.55 Resolution: 1920 x 1080 - Effects Quality: Ultra d a aa b 90 180 270 360 450 422.2 418.2 417.3 416.3
Unvanquished Resolution: 1920 x 1200 - Effects Quality: Ultra OpenBenchmarking.org Frames Per Second, More Is Better Unvanquished 0.55 Resolution: 1920 x 1200 - Effects Quality: Ultra a b aa d 80 160 240 320 400 385.3 384.2 384.1 380.8
Unvanquished Resolution: 2560 x 1440 - Effects Quality: Ultra OpenBenchmarking.org Frames Per Second, More Is Better Unvanquished 0.55 Resolution: 2560 x 1440 - Effects Quality: Ultra aa b a d 60 120 180 240 300 254.7 253.9 253.4 252.4
Unvanquished Resolution: 2560 x 1600 - Effects Quality: Ultra OpenBenchmarking.org Frames Per Second, More Is Better Unvanquished 0.55 Resolution: 2560 x 1600 - Effects Quality: Ultra d aa b a 50 100 150 200 250 235.0 234.6 234.0 233.9
Unvanquished Resolution: 1920 x 1080 - Effects Quality: Medium OpenBenchmarking.org Frames Per Second, More Is Better Unvanquished 0.55 Resolution: 1920 x 1080 - Effects Quality: Medium a b d aa 120 240 360 480 600 542.5 542.4 540.7 539.8
Unvanquished Resolution: 1920 x 1200 - Effects Quality: Medium OpenBenchmarking.org Frames Per Second, More Is Better Unvanquished 0.55 Resolution: 1920 x 1200 - Effects Quality: Medium b aa a d 110 220 330 440 550 517.5 512.5 511.8 504.8
Unvanquished Resolution: 2560 x 1440 - Effects Quality: Medium OpenBenchmarking.org Frames Per Second, More Is Better Unvanquished 0.55 Resolution: 2560 x 1440 - Effects Quality: Medium b d aa a 80 160 240 320 400 373.5 373.4 373.2 371.2
Unvanquished Resolution: 2560 x 1600 - Effects Quality: Medium OpenBenchmarking.org Frames Per Second, More Is Better Unvanquished 0.55 Resolution: 2560 x 1600 - Effects Quality: Medium d b aa a 70 140 210 280 350 337.0 336.6 336.2 336.0
WarpX Input: Uniform Plasma OpenBenchmarking.org Seconds, Fewer Is Better WarpX 24.10 Input: Uniform Plasma aa b d 5 10 15 20 25 21.18 21.70 21.71 1. (CXX) g++ options: -O3 -lm
WarpX Input: Plasma Acceleration OpenBenchmarking.org Seconds, Fewer Is Better WarpX 24.10 Input: Plasma Acceleration aa b d 13 26 39 52 65 55.37 56.63 60.22 1. (CXX) g++ options: -O3 -lm
Whisperfile Model Size: Tiny OpenBenchmarking.org Seconds, Fewer Is Better Whisperfile 20Aug24 Model Size: Tiny aa b d 13 26 39 52 65 55.19 58.16 59.90
Whisperfile Model Size: Small OpenBenchmarking.org Seconds, Fewer Is Better Whisperfile 20Aug24 Model Size: Small aa b d 50 100 150 200 250 229.33 231.06 243.97
Whisperfile Model Size: Medium OpenBenchmarking.org Seconds, Fewer Is Better Whisperfile 20Aug24 Model Size: Medium aa b d 150 300 450 600 750 655.14 657.64 690.37
XNNPACK Model: FP32MobileNetV1 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP32MobileNetV1 aa d b 3K 6K 9K 12K 15K 1536 1617 13914 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP32MobileNetV2 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP32MobileNetV2 aa d b 1500 3000 4500 6000 7500 1057 1130 6981 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP32MobileNetV3Large OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP32MobileNetV3Large aa d b 2K 4K 6K 8K 10K 1255 1319 9205 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP32MobileNetV3Small OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP32MobileNetV3Small aa d b 400 800 1200 1600 2000 465 515 1998 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP16MobileNetV1 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP16MobileNetV1 aa d b 1300 2600 3900 5200 6500 1868 2169 6239 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP16MobileNetV2 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP16MobileNetV2 aa d b 2K 4K 6K 8K 10K 1337 1533 10687 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP16MobileNetV3Large OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP16MobileNetV3Large aa d b 2K 4K 6K 8K 10K 1429 1602 11295 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP16MobileNetV3Small OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP16MobileNetV3Small aa b d 140 280 420 560 700 581 606 654 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: QS8MobileNetV2 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: QS8MobileNetV2 b aa d 130 260 390 520 650 526 533 581 1. (CXX) g++ options: -O3 -lrt -lm
Phoronix Test Suite v10.8.5