eoy2024 Benchmarks for a future article. AMD EPYC 4484PX 12-Core testing with a Supermicro AS-3015A-I H13SAE-MF v1.00 (2.1 BIOS) and ASPEED on Ubuntu 24.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2412083-NE-EOY20246055&sro&grt
eoy2024 system details (configurations: a, 4484PX, px)

Processor: a: AMD EPYC 4564P 16-Core @ 5.88GHz (16 Cores / 32 Threads); 4484PX, px: AMD EPYC 4484PX 12-Core @ 5.66GHz (12 Cores / 24 Threads)
Motherboard: Supermicro AS-3015A-I H13SAE-MF v1.00 (2.1 BIOS)
Chipset: AMD Device 14d8
Memory: 2 x 32GB DRAM-4800MT/s Micron MTC20C2085S1EC48BA1 BC
Disk: 3201GB Micron_7450_MTFDKCC3T2TFS + 960GB SAMSUNG MZ1L2960HCJR-00A07
Graphics: ASPEED
Audio: AMD Rembrandt Radeon HD Audio
Monitor: VA2431
Network: 2 x Intel I210
OS: Ubuntu 24.04
Kernel: a: 6.8.0-11-generic (x86_64); 4484PX, px: 6.12.2-061202-generic (x86_64)
Desktop: GNOME Shell 45.3
Display Server: X Server 1.21.1.11
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 1024x768

Kernel Details - Transparent Huge Pages: madvise

Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-fxIygj/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-fxIygj/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- a: Scaling Governor: amd-pstate-epp performance (EPP: performance) - CPU Microcode: 0xa601209
- 4484PX: Scaling Governor: amd-pstate-epp performance (Boost: Enabled EPP: performance) - CPU Microcode: 0xa601209
- px: Scaling Governor: amd-pstate-epp performance (Boost: Enabled EPP: performance) - CPU Microcode: 0xa601209

Java Details - OpenJDK Runtime Environment (build 21.0.2+13-Ubuntu-2)

Python Details - Python 3.12.3

Security Details
- a: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
- 4484PX: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
- px: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
eoy2024 result overview

Test | a | 4484PX | px
compress-7zip: Compression Rating | 163859 | 141263 | 142213
compress-7zip: Decompression Rating | 165916 | 125698 | 125605
mt-dgemm: Sustained Floating-Point Rate | 1141.194104 | 842.730642 | 842.012831
cassandra: Writes | 271333 | 174960 | 173946
couchdb: 100 - 1000 - 30 | 69.929 | 75.901 | 76.389
couchdb: 100 - 3000 - 30 | 232.188 | 253.99 | 254.733
couchdb: 300 - 1000 - 30 | 106.13 | 117.566 | 119.349
couchdb: 300 - 3000 - 30 | 367.83 | 406.12 | 408.483
couchdb: 500 - 1000 - 30 | 148.049 | 164.468 | 164.812
couchdb: 500 - 3000 - 30 | 511.775 | 559.346 | 560.7
astcenc: Fast | 396.6495 | 278.2445 | 277.2994
astcenc: Medium | 156.2217 | 109.0265 | 108.8588
astcenc: Thorough | 20.3025 | 14.17 | 14.1464
astcenc: Exhaustive | 1.6844 | 1.1887 | 1.1862
astcenc: Very Thorough | 2.741 | 1.9412 | 1.9391
blender: BMW27 - CPU-Only | 53.55 | 74.08 | 73.16
blender: Junkshop - CPU-Only | 73.56 | 97.01 | 97.1
blender: Classroom - CPU-Only | 143.36 | 197.2 | 197.53
blender: Fishy Cat - CPU-Only | 71.35 | 96.67 | 97.09
blender: Barbershop - CPU-Only | 506.2 | 679.34 | 678.4
blender: Pabellon Barcelona - CPU-Only | 166.12 | 226.34 | 224.64
build2: Time To Compile | 92.053 | 111.651 | 113.78
byte: Pipe | 48806257.1 | 33443359.2 | 33381363.1
byte: Dhrystone 2 | 1866536062.7 | 1346521770.3 | 1340340196.6
byte: System Call | 49140426.6 | 30761218.9 | 30701622.8
byte: Whetstone Double | 343491.9 | 244075.3 | 244131
cp2k: H20-64 | 58.191 | 53.005 | 52.724
cp2k: H20-256 | 592.857 | 628.104 | 631.31
cp2k: Fayalite-FIST | 94.032 | 92.21 | 94.896
etcpak: Multi-Threaded - ETC2 | 577.817 | 410.726 | 409.875
financebench: Repo OpenMP | 21418.445312 | 22320.332031 | 22318.738281
financebench: Bonds OpenMP | 33061.21875 | 34600.773438 | 34896.835938
gcrypt: | 162.125 | 171.023 | 163.839
gromacs: water_GMX50_bare | 1.692 | 1.577 | 1.575
litert: DeepLab V3 | 3579.67 | 2343.38 | 2359.99
litert: SqueezeNet | 1794.11 | 1809.18 | 1821.35
litert: Inception V4 | 21477.8 | 22083.3 | 22752.4
litert: NASNet Mobile | 16936 | 8057.56 | 7931.64
litert: Mobilenet Float | 1211.48 | 1244.7 | 1244.51
litert: Mobilenet Quant | 823.17 | 848.943 | 849.209
litert: Inception ResNet V2 | 19530.2 | 19477.8 | 19490.7
litert: Quantized COCO SSD MobileNet v1 | 2129.52 | 1420.15 | 1417.35
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128 | 6.88 | 7.11 | 7.12
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512 | 70.76 | 69.11 | 67.95
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024 | 70.85 | 66.57 | 66.35
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048 | 63.09 | 63.8 | 63.79
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128 | 7.24 | 7.41 | 7.44
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512 | 68.4 | 68.2 | 68.81
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024 | 69.26 | 66.85 | 66.52
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048 | 62.97 | 63.61 | 63.41
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128 | 47.72 | 52.3 | 52.37
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512 | 327.3 | 243.14 | 232.86
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024 | 355.09 | 232.26 | 244.77
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048 | 279.04 | 222.75 | 208.99
llamafile: Llama-3.2-3B-Instruct.Q6_K - Text Generation 16 | 19.03 | 19.49 | 19.5
llamafile: Llama-3.2-3B-Instruct.Q6_K - Text Generation 128 | 20.13 | 20.39 | 20.51
llamafile: Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 256 | 4096 | 4096 | 4096
llamafile: Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 512 | 8192 | 8192 | 8192
llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Text Generation 16 | 24.59 | 25.86 | 25.94
llamafile: Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 1024 | 16384 | 16384 | 16384
llamafile: Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 2048 | 32768 | 32768 | 32768
llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Text Generation 128 | 26.28 | 27.59 | 27.8
llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Text Generation 16 | 10.22 | 10.45 | 10.45
llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 256 | 4096 | 4096 | 4096
llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 512 | 8192 | 8192 | 8192
llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Text Generation 128 | 10.47 | 10.91 | 10.93
llamafile: wizardcoder-python-34b-v1.0.Q6_K - Text Generation 16 | 1.78 | 1.83 | 1.84
llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 1024 | 16384 | 16384 | 16384
llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 2048 | 32768 | 32768 | 32768
llamafile: wizardcoder-python-34b-v1.0.Q6_K - Text Generation 128 | 1.99 | 2.05 | 2.05
llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Prompt Processing 256 | 4096 | 4096 | 4096
llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Prompt Processing 512 | 8192 | 8192 | 8192
llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Prompt Processing 1024 | 16384 | 16384 | 16384
llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Prompt Processing 2048 | 32768 | 32768 | 32768
llamafile: wizardcoder-python-34b-v1.0.Q6_K - Prompt Processing 256 | 1536 | 1536 | 1536
llamafile: wizardcoder-python-34b-v1.0.Q6_K - Prompt Processing 512 | 3072 | 3072 | 3072
llamafile: wizardcoder-python-34b-v1.0.Q6_K - Prompt Processing 1024 | 6144 | 6144 | 6144
llamafile: wizardcoder-python-34b-v1.0.Q6_K - Prompt Processing 2048 | 12288 | 12288 | 12288
namd: ATPase with 327,506 Atoms | 2.79632 | 2.38124 | 2.35379
namd: STMV with 1,066,628 Atoms | 0.75656 | 0.65119 | 0.65448
numpy: | 775.75 | 745.59 | 831.42
onednn: IP Shapes 1D - CPU | 1.12573 | 1.93806 | 1.93913
onednn: IP Shapes 3D - CPU | 4.058 | 2.73072 | 2.72942
onednn: Convolution Batch Shapes Auto - CPU | 6.67287 | 4.11551 | 4.13321
onednn: Deconvolution Batch shapes_1d - CPU | 2.97612 | 3.40293 | 3.40628
onednn: Deconvolution Batch shapes_3d - CPU | 2.41294 | 3.5084 | 3.51243
onednn: Recurrent Neural Network Training - CPU | 1372.03 | 1898.36 | 1895.68
onednn: Recurrent Neural Network Inference - CPU | 700.859 | 965.015 | 966.013
onnx: GPT-2 - CPU - Standard | 134.596 | 159.71 | 157.893
onnx: GPT-2 - CPU - Standard | 7.42776 | 6.25815 | 6.33034
onnx: yolov4 - CPU - Standard | 11.0552 | 10.7338 | 10.7127
onnx: yolov4 - CPU - Standard | 90.4523 | 93.1605 | 93.3441
onnx: ZFNet-512 - CPU - Standard | 102.331 | 110.94 | 110.892
onnx: ZFNet-512 - CPU - Standard | 9.76985 | 9.01322 | 9.01687
onnx: T5 Encoder - CPU - Standard | 156.453 | 208.174 | 206.091
onnx: T5 Encoder - CPU - Standard | 6.39112 | 4.80287 | 4.85142
onnx: bertsquad-12 - CPU - Standard | 15.5899 | 14.5122 | 14.5747
onnx: bertsquad-12 - CPU - Standard | 64.141 | 68.9051 | 68.6104
onnx: CaffeNet 12-int8 - CPU - Standard | 636.318 | 941.401 | 937.778
onnx: CaffeNet 12-int8 - CPU - Standard | 1.57084 | 1.06188 | 1.066
onnx: fcn-resnet101-11 - CPU - Standard | 3.2167 | 2.81093 | 2.79638
onnx: fcn-resnet101-11 - CPU - Standard | 310.875 | 355.751 | 357.602
onnx: ArcFace ResNet-100 - CPU - Standard | 42.4537 | 37.3832 | 37.1048
onnx: ArcFace ResNet-100 - CPU - Standard | 23.553 | 26.7478 | 26.9485
onnx: ResNet50 v1-12-int8 - CPU - Standard | 390.597 | 356.409 | 356.194
onnx: ResNet50 v1-12-int8 - CPU - Standard | 2.55898 | 2.80544 | 2.80695
onnx: super-resolution-10 - CPU - Standard | 141.117 | 125.172 | 125.076
onnx: super-resolution-10 - CPU - Standard | 7.08601 | 7.98873 | 7.99486
onnx: ResNet101_DUC_HDC-12 - CPU - Standard | 1.54196 | 1.17627 | 1.1705
onnx: ResNet101_DUC_HDC-12 - CPU - Standard | 648.522 | 850.141 | 854.334
onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard | 47.0691 | 40.0935 | 43.362
onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard | 21.2429 | 24.9402 | 23.0604
openssl: ChaCha20 | 130588495050 | 97105235690 | 97019897450
openssl: AES-128-GCM | 104784522170 | 76496336760 | 76184405610
openssl: AES-256-GCM | 97172751700 | 71160291870 | 70902656480
openssl: ChaCha20-Poly1305 | 92393529340 | 68816544020 | 68678955550
openvino-genai: Gemma-7b-int4-ov - CPU | 9.83 | 10.23 | 10.24
openvino-genai: Gemma-7b-int4-ov - CPU - Time To First Token | 106.62 | 121.48 | 122.3
openvino-genai: Gemma-7b-int4-ov - CPU - Time Per Output Token | 101.72 | 97.79 | 97.61
openvino-genai: Falcon-7b-instruct-int4-ov - CPU | 12.93 | 13.4 | 13.41
openvino-genai: Falcon-7b-instruct-int4-ov - CPU - Time To First Token | 86.06 | 93.01 | 93
openvino-genai: Falcon-7b-instruct-int4-ov - CPU - Time Per Output Token | 77.34 | 74.65 | 74.54
openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU | 19.28 | 20.28 | 20.29
openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU - Time To First Token | 55.93 | 58.91 | 58.86
openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU - Time Per Output Token | 51.86 | 49.31 | 49.28
ospray: particle_volume/ao/real_time | 9.00917 | 6.52776 | 6.52206
ospray: particle_volume/scivis/real_time | 8.98486 | 6.44913 | 6.52304
ospray: particle_volume/pathtracer/real_time | 236.245 | 199.023 | 197.2
ospray: gravity_spheres_volume/dim_512/ao/real_time | 7.63944 | 5.63122 | 5.71084
ospray: gravity_spheres_volume/dim_512/scivis/real_time | 7.58789 | 5.54888 | 5.6147
ospray: gravity_spheres_volume/dim_512/pathtracer/real_time | 8.82093 | 6.41198 | 6.4074
povray: Trace Time | 18.542 | 25.264 | 25.328
primesieve: 1e12 | 6.347 | 9.116 | 9.147
primesieve: 1e13 | 78.498 | 110.608 | 110.709
pyperformance: go | 77.8 | 78.6 | 79.4
pyperformance: chaos | 38.2 | 39.7 | 39.4
pyperformance: float | 50.7 | 51.3 | 50.8
pyperformance: nbody | 59 | 59.5 | 59.2
pyperformance: pathlib | 14.2 | 14.4 | 14.4
pyperformance: raytrace | 175 | 182 | 182
pyperformance: xml_etree | 35.8 | 36.8 | 36.5
pyperformance: gc_collect | 677 | 699 | 706
pyperformance: json_loads | 12.1 | 12.4 | 12.5
pyperformance: crypto_pyaes | 41.7 | 43.1 | 43.3
pyperformance: async_tree_io | 755 | 666 | 656
pyperformance: regex_compile | 69.8 | 71.7 | 72.5
pyperformance: python_startup | 5.77 | 6.08 | 6.09
pyperformance: asyncio_tcp_ssl | 645 | 590 | 590
pyperformance: django_template | 20.7 | 21 | 21.2
pyperformance: asyncio_websockets | 315 | 321 | 322
pyperformance: pickle_pure_python | 165 | 169 | 168
quantlib: S | 12.7476 | 11.8647 | 11.839
quantlib: XXS | 13.432 | 12.1169 | 12.1057
relion: Basic - CPU | 944.27 | 729.4 | 733.02
renaissance: Scala Dotty | 477.0 | 428.6 | 436.2
renaissance: Rand Forest | 414.4 | 422.0 | 453.2
renaissance: ALS Movie Lens | 9805.7 | 9378.8 | 9275.7
renaissance: Apache Spark Bayes | 490.0 | 513.2 | 474.9
renaissance: Savina Reactors.IO | 3506.4 | 3655.8 | 3676.0
renaissance: Apache Spark PageRank | 2412.2 | 2138.1 | 2229.7
renaissance: Finagle HTTP Requests | 2319.4 | 2492.2 | 2483.1
renaissance: Gaussian Mixture Model | 3399.5 | 3860.6 | 3815.2
renaissance: In-Memory Database Shootout | 3256.1 | 3241.5 | 3175.6
renaissance: Akka Unbalanced Cobwebbed Tree | 4403.8 | 4038.4 | 4002.3
renaissance: Genetic Algorithm Using Jenetics + Futures | 732.8 | 904.0 | 920.7
rustls: handshake - TLS13_CHACHA20_POLY1305_SHA256 | 76454.45 | 57716.64 | 57688.08
rustls: handshake - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 | 80462.6 | 59308.75 | 59206.34
rustls: handshake-resume - TLS13_CHACHA20_POLY1305_SHA256 | 388077.69 | 333882.92 | 333574.3
rustls: handshake-ticket - TLS13_CHACHA20_POLY1305_SHA256 | 404263.45 | 344296.24 | 342775.29
rustls: handshake - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 | 423535.68 | 306153.2 | 304060.28
rustls: handshake-resume - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 | 3563852.57 | 3035330.21 | 3038723.48
rustls: handshake-ticket - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 | 2620332 | 2282729.64 | 2292879.44
rustls: handshake-resume - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 | 1820810.21 | 1586292.42 | 1572010.68
rustls: handshake-ticket - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 | 1553632.14 | 1329363.1 | 1340712.85
simdjson: Kostya | 5.97 | 6.11 | 5.45
simdjson: TopTweet | 10.46 | 10.82 | 10.51
simdjson: LargeRand | 1.83 | 1.84 | 1.84
simdjson: PartialTweets | 9.76 | 10.1 | 8.35
simdjson: DistinctUserID | 10.46 | 10.76 | 8.97
stockfish: Chess Benchmark | 46507038 | 33702298 | 33871595
stockfish: Chess Benchmark | 54752796 | 45267546 | 42973396
svt-av1: Preset 3 - Bosphorus 4K | 9.59 | 7.684 | 7.646
svt-av1: Preset 5 - Bosphorus 4K | 34.538 | 29.094 | 28.824
svt-av1: Preset 8 - Bosphorus 4K | 102.005 | 85.201 | 84.998
svt-av1: Preset 13 - Bosphorus 4K | 212.52 | 198.112 | 194.024
svt-av1: Preset 3 - Bosphorus 1080p | 29.573 | 25.446 | 25.447
svt-av1: Preset 5 - Bosphorus 1080p | 101.971 | 88.415 | 88.27
svt-av1: Preset 8 - Bosphorus 1080p | 339.023 | 287.047 | 286.962
svt-av1: Preset 13 - Bosphorus 1080p | 842.558 | 776.115 | 769.818
svt-av1: Preset 3 - Beauty 4K 10-bit | 1.422 | 1.188 | 1.184
svt-av1: Preset 5 - Beauty 4K 10-bit | 6.504 | 5.602 | 5.551
svt-av1: Preset 8 - Beauty 4K 10-bit | 12.468 | 10.967 | 10.855
svt-av1: Preset 13 - Beauty 4K 10-bit | 18.588 | 17.406 | 17.355
build-eigen: Time To Compile | 58.655 | 67.364 | 67.076
whisper-cpp: ggml-base.en - 2016 State of the Union | 87.48973 | 92.70933 | 93.45463
whisper-cpp: ggml-small.en - 2016 State of the Union | 245.07838 | 268.23891 | 266.81425
whisper-cpp: ggml-medium.en - 2016 State of the Union | 700.91 | 809.78969 | 809.489
whisperfile: Tiny | 41.70935 | 37.13462 | 38.71828
whisperfile: Small | 195.41642 | 173.38197 | 167.89219
whisperfile: Medium | 534.919 | 473.55091 | 475.51084
x265: Bosphorus 4K | 32.57 | 27.16 | 26.94
x265: Bosphorus 1080p | 114.45 | 101.37 | 101.25
xnnpack: FP32MobileNetV1 | 1252 | 1257 | 1272
xnnpack: FP32MobileNetV2 | 1495 | 1365 | 1368
xnnpack: FP32MobileNetV3Large | 1810 | 1515 | 1574
xnnpack: FP32MobileNetV3Small | 979 | 809 | 837
xnnpack: FP16MobileNetV1 | 1143 | 1383 | 1386
xnnpack: FP16MobileNetV2 | 1190 | 1217 | 1248
xnnpack: FP16MobileNetV3Large | 1498 | 1467 | 1527
xnnpack: FP16MobileNetV3Small | 920 | 779 | 798
xnnpack: QS8MobileNetV2 | 844 | 717 | 723
y-cruncher: 1B | 18.485 | 18.379 | 18.365
y-cruncher: 500M | 8.772 | 8.688 | 8.623
7-Zip Compression Test: Compression Rating (MIPS, More Is Better): 4484PX 141263; a 163859; px 142213
7-Zip Compression Test: Decompression Rating (MIPS, More Is Better): 4484PX 125698; a 165916; px 125605
1. 7-Zip 23.01 (x64) : Copyright (c) 1999-2023 Igor Pavlov : 2023-06-20
ACES DGEMM 1.0 Sustained Floating-Point Rate (GFLOP/s, More Is Better): 4484PX 842.73; a 1141.19; px 842.01
1. (CC) gcc options: -ffast-math -mavx2 -O3 -fopenmp -lopenblas
Apache Cassandra 5.0 Test: Writes (Op/s, More Is Better): 4484PX 174960; a 271333; px 173946
Apache CouchDB 3.4.1 Bulk Size: 100 - Inserts: 1000 - Rounds: 30 (Seconds, Fewer Is Better): 4484PX 75.90; a 69.93; px 76.39
Apache CouchDB 3.4.1 Bulk Size: 100 - Inserts: 3000 - Rounds: 30 (Seconds, Fewer Is Better): 4484PX 253.99; a 232.19; px 254.73
Apache CouchDB 3.4.1 Bulk Size: 300 - Inserts: 1000 - Rounds: 30 (Seconds, Fewer Is Better): 4484PX 117.57; a 106.13; px 119.35
Apache CouchDB 3.4.1 Bulk Size: 300 - Inserts: 3000 - Rounds: 30 (Seconds, Fewer Is Better): 4484PX 406.12; a 367.83; px 408.48
Apache CouchDB 3.4.1 Bulk Size: 500 - Inserts: 1000 - Rounds: 30 (Seconds, Fewer Is Better): 4484PX 164.47; a 148.05; px 164.81
Apache CouchDB 3.4.1 Bulk Size: 500 - Inserts: 3000 - Rounds: 30 (Seconds, Fewer Is Better): 4484PX 559.35; a 511.78; px 560.70
1. (CXX) g++ options: -flto -lstdc++ -shared -lei
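As a reading aid, here is a minimal sketch of how the "More Is Better" and "Fewer Is Better" labels on the results above can be folded into one direction-aware ratio. The helper name `speedup` and its `higher_is_better` flag are illustrative assumptions, not part of the Phoronix Test Suite; the numbers are copied from the 7-Zip compression and the first CouchDB results above.

```python
def speedup(candidate: float, baseline: float, higher_is_better: bool) -> float:
    """Return how many times better `candidate` is than `baseline`.

    Values above 1.0 favor the candidate regardless of the metric direction.
    """
    if candidate <= 0 or baseline <= 0:
        raise ValueError("benchmark results must be positive")
    # Throughput-style metrics: larger is better, so divide candidate by baseline.
    # Time-style metrics: smaller is better, so invert the ratio.
    return candidate / baseline if higher_is_better else baseline / candidate


# 7-Zip Compression Rating (MIPS, more is better): configuration a vs. 4484PX
zip_ratio = speedup(163859, 141263, higher_is_better=True)

# CouchDB bulk 100 / inserts 1000 / rounds 30 (seconds, fewer is better): a vs. 4484PX
couch_ratio = speedup(69.93, 75.90, higher_is_better=False)

print(f"7-Zip compression: a is {zip_ratio:.2f}x the 4484PX result")
print(f"CouchDB 100-1000-30: a is {couch_ratio:.2f}x the 4484PX result")
```

Inverting the ratio for time-based results keeps every comparison on the same scale, so a value above 1.0 always means the first configuration did better.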
ASTC Encoder 5.0 Preset: Fast (MT/s, More Is Better): 4484PX 278.24; a 396.65; px 277.30
ASTC Encoder 5.0 Preset: Medium (MT/s, More Is Better): 4484PX 109.03; a 156.22; px 108.86
ASTC Encoder 5.0 Preset: Thorough (MT/s, More Is Better): 4484PX 14.17; a 20.30; px 14.15
ASTC Encoder 5.0 Preset: Exhaustive (MT/s, More Is Better): 4484PX 1.1887; a 1.6844; px 1.1862
ASTC Encoder 5.0 Preset: Very Thorough (MT/s, More Is Better): 4484PX 1.9412; a 2.7410; px 1.9391
1. (CXX) g++ options: -O3 -flto -pthread
Blender 4.3 Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better): 4484PX 74.08; a 53.55; px 73.16
Blender 4.3 Blend File: Junkshop - Compute: CPU-Only (Seconds, Fewer Is Better): 4484PX 97.01; a 73.56; px 97.10
Blender 4.3 Blend File: Classroom - Compute: CPU-Only (Seconds, Fewer Is Better): 4484PX 197.20; a 143.36; px 197.53
Blender 4.3 Blend File: Fishy Cat - Compute: CPU-Only (Seconds, Fewer Is Better): 4484PX 96.67; a 71.35; px 97.09
Blender 4.3 Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better): 4484PX 679.34; a 506.20; px 678.40
Blender 4.3 Blend File: Pabellon Barcelona - Compute: CPU-Only (Seconds, Fewer Is Better): 4484PX 226.34; a 166.12; px 224.64
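To summarize a family of results such as the six Blender scenes above in one figure, a common approach is the geometric mean of the per-scene ratios. The sketch below is only an illustration: the variable names are hypothetical, the choice of geometric mean is an assumption rather than something the export itself reports, and the times are copied from the Blender results above.

```python
from statistics import geometric_mean

# Blender 4.3 render times in seconds (fewer is better), copied from the
# results above as (scene, 4484PX, a).
blender_seconds = [
    ("BMW27", 74.08, 53.55),
    ("Junkshop", 97.01, 73.56),
    ("Classroom", 197.20, 143.36),
    ("Fishy Cat", 96.67, 71.35),
    ("Barbershop", 679.34, 506.20),
    ("Pabellon Barcelona", 226.34, 166.12),
]

# For a time-based metric, the ratio 4484PX / a says how many times faster
# configuration `a` finished the same scene.
ratios = [t_4484px / t_a for _, t_4484px, t_a in blender_seconds]

# The geometric mean is the usual way to average ratios, because it treats
# a 2x gain and a 2x loss symmetrically.
overall = geometric_mean(ratios)

for (scene, _, _), ratio in zip(blender_seconds, ratios):
    print(f"{scene}: a is {ratio:.2f}x faster than 4484PX")
print(f"Geometric mean across scenes: {overall:.2f}x")
```

An arithmetic mean of the raw render times would let Barbershop dominate the summary simply because it runs longest; averaging ratios avoids that.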
Build2 0.17 Time To Compile (Seconds, Fewer Is Better): 4484PX 111.65; a 92.05; px 113.78
BYTE Unix Benchmark 5.1.3-git Computational Test: Pipe (LPS, More Is Better): 4484PX 33443359.2; a 48806257.1; px 33381363.1
BYTE Unix Benchmark 5.1.3-git Computational Test: Dhrystone 2 (LPS, More Is Better): 4484PX 1346521770.3; a 1866536062.7; px 1340340196.6
BYTE Unix Benchmark 5.1.3-git Computational Test: System Call (LPS, More Is Better): 4484PX 30761218.9; a 49140426.6; px 30701622.8
BYTE Unix Benchmark 5.1.3-git Computational Test: Whetstone Double (MWIPS, More Is Better): 4484PX 244075.3; a 343491.9; px 244131.0
1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
CP2K Molecular Dynamics 2024.3 Input: H20-64 (Seconds, Fewer Is Better): 4484PX 53.01; a 58.19; px 52.72
CP2K Molecular Dynamics 2024.3 Input: H20-256 (Seconds, Fewer Is Better): 4484PX 628.10; a 592.86; px 631.31
CP2K Molecular Dynamics 2024.3 Input: Fayalite-FIST (Seconds, Fewer Is Better): 4484PX 92.21; a 94.03; px 94.90
1. (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
Etcpak 2.0 Benchmark: Multi-Threaded - Configuration: ETC2 (Mpx/s, More Is Better): 4484PX 410.73; a 577.82; px 409.88
1. (CXX) g++ options: -flto -pthread
FinanceBench 2016-07-25 Benchmark: Repo OpenMP (ms, Fewer Is Better): 4484PX 22320.33; a 21418.45; px 22318.74
FinanceBench 2016-07-25 Benchmark: Bonds OpenMP (ms, Fewer Is Better): 4484PX 34600.77; a 33061.22; px 34896.84
1. (CXX) g++ options: -O3 -march=native -fopenmp
Gcrypt Library 1.10.3 (Seconds, Fewer Is Better): 4484PX 171.02; a 162.13; px 163.84
1. (CC) gcc options: -O2 -fvisibility=hidden
GROMACS Input: water_GMX50_bare (Ns Per Day, More Is Better): 4484PX 1.577; a 1.692; px 1.575
1. GROMACS version: 2023.3-Ubuntu_2023.3_1ubuntu3
LiteRT 2024-10-15 Model: DeepLab V3 (Microseconds, Fewer Is Better): 4484PX 2343.38; a 3579.67; px 2359.99
LiteRT 2024-10-15 Model: SqueezeNet (Microseconds, Fewer Is Better): 4484PX 1809.18; a 1794.11; px 1821.35
LiteRT 2024-10-15 Model: Inception V4 (Microseconds, Fewer Is Better): 4484PX 22083.3; a 21477.8; px 22752.4
LiteRT 2024-10-15 Model: NASNet Mobile (Microseconds, Fewer Is Better): 4484PX 8057.56; a 16936.00; px 7931.64
LiteRT 2024-10-15 Model: Mobilenet Float (Microseconds, Fewer Is Better): 4484PX 1244.70; a 1211.48; px 1244.51
LiteRT 2024-10-15 Model: Mobilenet Quant (Microseconds, Fewer Is Better): 4484PX 848.94; a 823.17; px 849.21
LiteRT 2024-10-15 Model: Inception ResNet V2 (Microseconds, Fewer Is Better): 4484PX 19477.8; a 19530.2; px 19490.7
LiteRT 2024-10-15 Model: Quantized COCO SSD MobileNet v1 (Microseconds, Fewer Is Better): 4484PX 1420.15; a 2129.52; px 1417.35
Llama.cpp b4154 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128 (Tokens Per Second, More Is Better): 4484PX 7.11; a 6.88; px 7.12
Llama.cpp b4154 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512 (Tokens Per Second, More Is Better): 4484PX 69.11; a 70.76; px 67.95
Llama.cpp b4154 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024 (Tokens Per Second, More Is Better): 4484PX 66.57; a 70.85; px 66.35
Llama.cpp b4154 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048 (Tokens Per Second, More Is Better): 4484PX 63.80; a 63.09; px 63.79
Llama.cpp b4154 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128 (Tokens Per Second, More Is Better): 4484PX 7.41; a 7.24; px 7.44
Llama.cpp b4154 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512 (Tokens Per Second, More Is Better): 4484PX 68.20; a 68.40; px 68.81
Llama.cpp b4154 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024 (Tokens Per Second, More Is Better): 4484PX 66.85; a 69.26; px 66.52
Llama.cpp b4154 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048 (Tokens Per Second, More Is Better): 4484PX 63.61; a 62.97; px 63.41
Llama.cpp b4154 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128 (Tokens Per Second, More Is Better): 4484PX 52.30; a 47.72; px 52.37
Llama.cpp b4154 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512 (Tokens Per Second, More Is Better): 4484PX 243.14; a 327.30; px 232.86
Llama.cpp b4154 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024 (Tokens Per Second, More Is Better): 4484PX 232.26; a 355.09; px 244.77
Llama.cpp b4154 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048 (Tokens Per Second, More Is Better): 4484PX 222.75; a 279.04; px 208.99
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
Llamafile 0.8.16 - Model: Llama-3.2-3B-Instruct.Q6_K - Test: Text Generation 16 (Tokens Per Second, More Is Better): 4484PX 19.49; a 19.03; px 19.50
Llamafile 0.8.16 - Model: Llama-3.2-3B-Instruct.Q6_K - Test: Text Generation 128 (Tokens Per Second, More Is Better): 4484PX 20.39; a 20.13; px 20.51
Llamafile 0.8.16 - Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 256 (Tokens Per Second, More Is Better): 4484PX 4096; a 4096; px 4096
Llamafile 0.8.16 - Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 512 (Tokens Per Second, More Is Better): 4484PX 8192; a 8192; px 8192
Llamafile 0.8.16 - Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Text Generation 16 (Tokens Per Second, More Is Better): 4484PX 25.86; a 24.59; px 25.94
Llamafile 0.8.16 - Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 1024 (Tokens Per Second, More Is Better): 4484PX 16384; a 16384; px 16384
Llamafile 0.8.16 - Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 2048 (Tokens Per Second, More Is Better): 4484PX 32768; a 32768; px 32768
Llamafile 0.8.16 - Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Text Generation 128 (Tokens Per Second, More Is Better): 4484PX 27.59; a 26.28; px 27.80
Llamafile 0.8.16 - Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Text Generation 16 (Tokens Per Second, More Is Better): 4484PX 10.45; a 10.22; px 10.45
Llamafile 0.8.16 - Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 256 (Tokens Per Second, More Is Better): 4484PX 4096; a 4096; px 4096
Llamafile 0.8.16 - Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 512 (Tokens Per Second, More Is Better): 4484PX 8192; a 8192; px 8192
Llamafile 0.8.16 - Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Text Generation 128 (Tokens Per Second, More Is Better): 4484PX 10.91; a 10.47; px 10.93
Llamafile 0.8.16 - Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Text Generation 16 (Tokens Per Second, More Is Better): 4484PX 1.83; a 1.78; px 1.84
Llamafile 0.8.16 - Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 1024 (Tokens Per Second, More Is Better): 4484PX 16384; a 16384; px 16384
Llamafile 0.8.16 - Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 2048 (Tokens Per Second, More Is Better): 4484PX 32768; a 32768; px 32768
Llamafile 0.8.16 - Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Text Generation 128 (Tokens Per Second, More Is Better): 4484PX 2.05; a 1.99; px 2.05
Llamafile 0.8.16 - Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Prompt Processing 256 (Tokens Per Second, More Is Better): 4484PX 4096; a 4096; px 4096
Llamafile 0.8.16 - Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Prompt Processing 512 (Tokens Per Second, More Is Better): 4484PX 8192; a 8192; px 8192
Llamafile 0.8.16 - Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Prompt Processing 1024 (Tokens Per Second, More Is Better): 4484PX 16384; a 16384; px 16384
Llamafile 0.8.16 - Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Prompt Processing 2048 (Tokens Per Second, More Is Better): 4484PX 32768; a 32768; px 32768
Llamafile 0.8.16 - Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Prompt Processing 256 (Tokens Per Second, More Is Better): 4484PX 1536; a 1536; px 1536
Llamafile 0.8.16 - Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Prompt Processing 512 (Tokens Per Second, More Is Better): 4484PX 3072; a 3072; px 3072
Llamafile 0.8.16 - Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Prompt Processing 1024 (Tokens Per Second, More Is Better): 4484PX 6144; a 6144; px 6144
Llamafile 0.8.16 - Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Prompt Processing 2048 (Tokens Per Second, More Is Better): 4484PX 12288; a 12288; px 12288
NAMD 3.0 - Input: ATPase with 327,506 Atoms (ns/day, More Is Better): 4484PX 2.38124; a 2.79632; px 2.35379
NAMD 3.0 - Input: STMV with 1,066,628 Atoms (ns/day, More Is Better): 4484PX 0.65119; a 0.75656; px 0.65448
Numpy Benchmark (Score, More Is Better): 4484PX 745.59; a 775.75; px 831.42
oneDNN 3.6 - Harness: IP Shapes 1D - Engine: CPU (ms, Fewer Is Better): 4484PX 1.93806 (MIN: 1.92); a 1.12573 (MIN: 1.03); px 1.93913 (MIN: 1.91)
oneDNN 3.6 - Harness: IP Shapes 3D - Engine: CPU (ms, Fewer Is Better): 4484PX 2.73072 (MIN: 2.7); a 4.05800 (MIN: 3.75); px 2.72942 (MIN: 2.7)
oneDNN 3.6 - Harness: Convolution Batch Shapes Auto - Engine: CPU (ms, Fewer Is Better): 4484PX 4.11551 (MIN: 4.05); a 6.67287 (MIN: 6.2); px 4.13321 (MIN: 4.07)
oneDNN 3.6 - Harness: Deconvolution Batch shapes_1d - Engine: CPU (ms, Fewer Is Better): 4484PX 3.40293 (MIN: 3.03); a 2.97612 (MIN: 2.42); px 3.40628 (MIN: 3.03)
oneDNN 3.6 - Harness: Deconvolution Batch shapes_3d - Engine: CPU (ms, Fewer Is Better): 4484PX 3.50840 (MIN: 3.46); a 2.41294 (MIN: 2.34); px 3.51243 (MIN: 3.47)
oneDNN 3.6 - Harness: Recurrent Neural Network Training - Engine: CPU (ms, Fewer Is Better): 4484PX 1898.36 (MIN: 1894.26); a 1372.03 (MIN: 1342.06); px 1895.68 (MIN: 1892.59)
oneDNN 3.6 - Harness: Recurrent Neural Network Inference - Engine: CPU (ms, Fewer Is Better): 4484PX 965.02 (MIN: 963.27); a 700.86 (MIN: 679.89); px 966.01 (MIN: 963.43)
oneDNN build note: (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
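Because the oneDNN numbers are timings where fewer is better, a relative comparison divides the baseline time by the compared time rather than the other way around. The following is a minimal Python sketch of that arithmetic using the oneDNN values listed above. The names `ONEDNN_MS` and `speedup` are illustrative and not part of the Phoronix Test Suite, and treating 4484PX as the baseline is an assumption made for the example.

```python
# oneDNN 3.6 timings in ms (fewer is better), copied from the results above.
ONEDNN_MS = {
    "IP Shapes 1D": {"4484PX": 1.93806, "a": 1.12573, "px": 1.93913},
    "IP Shapes 3D": {"4484PX": 2.73072, "a": 4.05800, "px": 2.72942},
    "Convolution Batch Shapes Auto": {"4484PX": 4.11551, "a": 6.67287, "px": 4.13321},
    "Deconvolution Batch shapes_1d": {"4484PX": 3.40293, "a": 2.97612, "px": 3.40628},
    "Deconvolution Batch shapes_3d": {"4484PX": 3.50840, "a": 2.41294, "px": 3.51243},
    "Recurrent Neural Network Training": {"4484PX": 1898.36, "a": 1372.03, "px": 1895.68},
    "Recurrent Neural Network Inference": {"4484PX": 965.02, "a": 700.86, "px": 966.01},
}


def speedup(times, system, baseline):
    """Return how many times faster `system` is than `baseline` for a time metric.

    A value above 1.0 means `system` finished in less time than `baseline`.
    """
    return times[baseline] / times[system]


# Hypothetical choice: compare run "a" against run "4484PX".
speedups = {harness: speedup(t, "a", "4484PX") for harness, t in ONEDNN_MS.items()}
faster = [harness for harness, s in speedups.items() if s > 1.0]

for harness, s in speedups.items():
    print(f"{harness}: a is {s:.2f}x the speed of 4484PX")
```

For the "More Is Better" results elsewhere in this export, the ratio would be inverted (compared value divided by baseline value).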
ONNX Runtime 1.19 - Model: GPT-2 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better): 4484PX 159.71; a 134.60; px 157.89
ONNX Runtime 1.19 - Model: GPT-2 - Device: CPU - Executor: Standard (Inference Time Cost in ms, Fewer Is Better): 4484PX 6.25815; a 7.42776; px 6.33034
ONNX Runtime 1.19 - Model: yolov4 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better): 4484PX 10.73; a 11.06; px 10.71
ONNX Runtime 1.19 - Model: yolov4 - Device: CPU - Executor: Standard (Inference Time Cost in ms, Fewer Is Better): 4484PX 93.16; a 90.45; px 93.34
ONNX Runtime 1.19 - Model: ZFNet-512 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better): 4484PX 110.94; a 102.33; px 110.89
ONNX Runtime 1.19 - Model: ZFNet-512 - Device: CPU - Executor: Standard (Inference Time Cost in ms, Fewer Is Better): 4484PX 9.01322; a 9.76985; px 9.01687
ONNX Runtime 1.19 - Model: T5 Encoder - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better): 4484PX 208.17; a 156.45; px 206.09
ONNX Runtime 1.19 - Model: T5 Encoder - Device: CPU - Executor: Standard (Inference Time Cost in ms, Fewer Is Better): 4484PX 4.80287; a 6.39112; px 4.85142
ONNX Runtime 1.19 - Model: bertsquad-12 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better): 4484PX 14.51; a 15.59; px 14.57
ONNX Runtime 1.19 - Model: bertsquad-12 - Device: CPU - Executor: Standard (Inference Time Cost in ms, Fewer Is Better): 4484PX 68.91; a 64.14; px 68.61
ONNX Runtime 1.19 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better): 4484PX 941.40; a 636.32; px 937.78
ONNX Runtime 1.19 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard (Inference Time Cost in ms, Fewer Is Better): 4484PX 1.06188; a 1.57084; px 1.06600
ONNX Runtime 1.19 - Model: fcn-resnet101-11 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better): 4484PX 2.81093; a 3.21670; px 2.79638
ONNX Runtime 1.19 - Model: fcn-resnet101-11 - Device: CPU - Executor: Standard (Inference Time Cost in ms, Fewer Is Better): 4484PX 355.75; a 310.88; px 357.60
ONNX Runtime 1.19 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better): 4484PX 37.38; a 42.45; px 37.10
ONNX Runtime 1.19 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard (Inference Time Cost in ms, Fewer Is Better): 4484PX 26.75; a 23.55; px 26.95
ONNX Runtime 1.19 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better): 4484PX 356.41; a 390.60; px 356.19
ONNX Runtime 1.19 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard (Inference Time Cost in ms, Fewer Is Better): 4484PX 2.80544; a 2.55898; px 2.80695
ONNX Runtime 1.19 - Model: super-resolution-10 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better): 4484PX 125.17; a 141.12; px 125.08
ONNX Runtime 1.19 - Model: super-resolution-10 - Device: CPU - Executor: Standard (Inference Time Cost in ms, Fewer Is Better): 4484PX 7.98873; a 7.08601; px 7.99486
ONNX Runtime 1.19 - Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better): 4484PX 1.17627; a 1.54196; px 1.17050
ONNX Runtime 1.19 - Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard (Inference Time Cost in ms, Fewer Is Better): 4484PX 850.14; a 648.52; px 854.33
ONNX Runtime 1.19 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better): 4484PX 40.09; a 47.07; px 43.36
ONNX Runtime 1.19 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard (Inference Time Cost in ms, Fewer Is Better): 4484PX 24.94; a 21.24; px 23.06
ONNX Runtime build note: (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
OpenSSL - Algorithm: ChaCha20 (byte/s, More Is Better): 4484PX 97105235690; a 130588495050; px 97019897450
OpenSSL - Algorithm: AES-128-GCM (byte/s, More Is Better): 4484PX 76496336760; a 104784522170; px 76184405610
OpenSSL - Algorithm: AES-256-GCM (byte/s, More Is Better): 4484PX 71160291870; a 97172751700; px 70902656480
OpenSSL - Algorithm: ChaCha20-Poly1305 (byte/s, More Is Better): 4484PX 68816544020; a 92393529340; px 68678955550
OpenSSL note: OpenSSL 3.0.13 30 Jan 2024 (Library: OpenSSL 3.0.13 30 Jan 2024) - Additional Parameters: -engine qatengine -async_jobs 8
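The OpenSSL figures are raw byte/s counts, which are hard to compare by eye. As a rough sketch, the Python below converts them to decimal GB/s and computes the throughput of run "a" relative to run "4484PX". The dictionary and function names are made up for illustration, and the choice of 10^9 bytes per GB (rather than GiB) is an assumption.

```python
# OpenSSL throughput in byte/s (more is better), copied from the results above.
OPENSSL_BYTES_PER_S = {
    "ChaCha20": {"4484PX": 97105235690, "a": 130588495050, "px": 97019897450},
    "AES-128-GCM": {"4484PX": 76496336760, "a": 104784522170, "px": 76184405610},
    "AES-256-GCM": {"4484PX": 71160291870, "a": 97172751700, "px": 70902656480},
    "ChaCha20-Poly1305": {"4484PX": 68816544020, "a": 92393529340, "px": 68678955550},
}


def to_gb_per_s(bytes_per_s):
    """Convert byte/s to decimal gigabytes per second (assumes 1 GB = 1e9 bytes)."""
    return bytes_per_s / 1e9


def relative_throughput(values, system, baseline):
    """Ratio of `system` to `baseline` for a throughput metric (above 1.0 is faster)."""
    return values[system] / values[baseline]


ratios = {
    algorithm: relative_throughput(values, "a", "4484PX")
    for algorithm, values in OPENSSL_BYTES_PER_S.items()
}

for algorithm, values in OPENSSL_BYTES_PER_S.items():
    print(
        f"{algorithm}: a = {to_gb_per_s(values['a']):.2f} GB/s, "
        f"4484PX = {to_gb_per_s(values['4484PX']):.2f} GB/s, "
        f"ratio = {ratios[algorithm]:.3f}"
    )
```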
OpenVINO GenAI 2024.5 - Model: Gemma-7b-int4-ov - Device: CPU (tokens/s, More Is Better): 4484PX 10.23; a 9.83; px 10.24
OpenVINO GenAI 2024.5 - Model: Gemma-7b-int4-ov - Device: CPU - Time To First Token (ms, Fewer Is Better): 4484PX 121.48; a 106.62; px 122.30
OpenVINO GenAI 2024.5 - Model: Gemma-7b-int4-ov - Device: CPU - Time Per Output Token (ms, Fewer Is Better): 4484PX 97.79; a 101.72; px 97.61
OpenVINO GenAI 2024.5 - Model: Falcon-7b-instruct-int4-ov - Device: CPU (tokens/s, More Is Better): 4484PX 13.40; a 12.93; px 13.41
OpenVINO GenAI 2024.5 - Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time To First Token (ms, Fewer Is Better): 4484PX 93.01; a 86.06; px 93.00
OpenVINO GenAI 2024.5 - Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time Per Output Token (ms, Fewer Is Better): 4484PX 74.65; a 77.34; px 74.54
OpenVINO GenAI 2024.5 - Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU (tokens/s, More Is Better): 4484PX 20.28; a 19.28; px 20.29
OpenVINO GenAI 2024.5 - Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time To First Token (ms, Fewer Is Better): 4484PX 58.91; a 55.93; px 58.86
OpenVINO GenAI 2024.5 - Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time Per Output Token (ms, Fewer Is Better): 4484PX 49.31; a 51.86; px 49.28
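The OpenVINO GenAI tokens/s and Time Per Output Token figures above look close to reciprocals of each other (1000 ms divided by the per-token time). The Python sketch below checks that arithmetic against the listed values. This is an observation about the numbers in this export, not a statement of how OpenVINO GenAI derives its metrics, and the names used are illustrative.

```python
# OpenVINO GenAI 2024.5 results copied from the lines above.
# Each entry pairs reported tokens/s with reported Time Per Output Token (ms).
OPENVINO = {
    "Gemma-7b-int4-ov": {
        "tokens_per_s": {"4484PX": 10.23, "a": 9.83, "px": 10.24},
        "ms_per_token": {"4484PX": 97.79, "a": 101.72, "px": 97.61},
    },
    "Falcon-7b-instruct-int4-ov": {
        "tokens_per_s": {"4484PX": 13.40, "a": 12.93, "px": 13.41},
        "ms_per_token": {"4484PX": 74.65, "a": 77.34, "px": 74.54},
    },
    "Phi-3-mini-128k-instruct-int4-ov": {
        "tokens_per_s": {"4484PX": 20.28, "a": 19.28, "px": 20.29},
        "ms_per_token": {"4484PX": 49.31, "a": 51.86, "px": 49.28},
    },
}


def implied_tokens_per_s(ms_per_token):
    """Tokens per second implied by a per-token latency given in milliseconds."""
    return 1000.0 / ms_per_token


# Absolute gap between the implied rate and the reported rate, per model and run.
gaps = {
    (model, run): abs(implied_tokens_per_s(data["ms_per_token"][run]) - reported)
    for model, data in OPENVINO.items()
    for run, reported in data["tokens_per_s"].items()
}
max_gap = max(gaps.values())
print(f"largest gap between implied and reported tokens/s: {max_gap:.4f}")
```

Time To First Token is left out of the check because it measures prompt latency rather than steady-state generation.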
OSPRay 3.2 - Benchmark: particle_volume/ao/real_time (Items Per Second, More Is Better): 4484PX 6.52776; a 9.00917; px 6.52206
OSPRay 3.2 - Benchmark: particle_volume/scivis/real_time (Items Per Second, More Is Better): 4484PX 6.44913; a 8.98486; px 6.52304
OSPRay 3.2 - Benchmark: particle_volume/pathtracer/real_time (Items Per Second, More Is Better): 4484PX 199.02; a 236.25; px 197.20
OSPRay 3.2 - Benchmark: gravity_spheres_volume/dim_512/ao/real_time (Items Per Second, More Is Better): 4484PX 5.63122; a 7.63944; px 5.71084
OSPRay 3.2 - Benchmark: gravity_spheres_volume/dim_512/scivis/real_time (Items Per Second, More Is Better): 4484PX 5.54888; a 7.58789; px 5.61470
OSPRay 3.2 - Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time (Items Per Second, More Is Better): 4484PX 6.41198; a 8.82093; px 6.40740
POV-Ray - Trace Time (Seconds, Fewer Is Better): 4484PX 25.26; a 18.54; px 25.33. Version: POV-Ray 3.7.0.10.unofficial
Primesieve 12.6 - Length: 1e12 (Seconds, Fewer Is Better): 4484PX 9.116; a 6.347; px 9.147
Primesieve 12.6 - Length: 1e13 (Seconds, Fewer Is Better): 4484PX 110.61; a 78.50; px 110.71
Primesieve build note: (CXX) g++ options: -O3
PyPerformance 1.11 - Benchmark: go (Milliseconds, Fewer Is Better): 4484PX 78.6; a 77.8; px 79.4
PyPerformance 1.11 - Benchmark: chaos (Milliseconds, Fewer Is Better): 4484PX 39.7; a 38.2; px 39.4
PyPerformance 1.11 - Benchmark: float (Milliseconds, Fewer Is Better): 4484PX 51.3; a 50.7; px 50.8
PyPerformance 1.11 - Benchmark: nbody (Milliseconds, Fewer Is Better): 4484PX 59.5; a 59.0; px 59.2
PyPerformance 1.11 - Benchmark: pathlib (Milliseconds, Fewer Is Better): 4484PX 14.4; a 14.2; px 14.4
PyPerformance 1.11 - Benchmark: raytrace (Milliseconds, Fewer Is Better): 4484PX 182; a 175; px 182
PyPerformance 1.11 - Benchmark: xml_etree (Milliseconds, Fewer Is Better): 4484PX 36.8; a 35.8; px 36.5
PyPerformance 1.11 - Benchmark: gc_collect (Milliseconds, Fewer Is Better): 4484PX 699; a 677; px 706
PyPerformance 1.11 - Benchmark: json_loads (Milliseconds, Fewer Is Better): 4484PX 12.4; a 12.1; px 12.5
PyPerformance 1.11 - Benchmark: crypto_pyaes (Milliseconds, Fewer Is Better): 4484PX 43.1; a 41.7; px 43.3
PyPerformance 1.11 - Benchmark: async_tree_io (Milliseconds, Fewer Is Better): 4484PX 666; a 755; px 656
PyPerformance 1.11 - Benchmark: regex_compile (Milliseconds, Fewer Is Better): 4484PX 71.7; a 69.8; px 72.5
PyPerformance 1.11 - Benchmark: python_startup (Milliseconds, Fewer Is Better): 4484PX 6.08; a 5.77; px 6.09
PyPerformance 1.11 - Benchmark: asyncio_tcp_ssl (Milliseconds, Fewer Is Better): 4484PX 590; a 645; px 590
PyPerformance 1.11 - Benchmark: django_template (Milliseconds, Fewer Is Better): 4484PX 21.0; a 20.7; px 21.2
PyPerformance 1.11 - Benchmark: asyncio_websockets (Milliseconds, Fewer Is Better): 4484PX 321; a 315; px 322
PyPerformance 1.11 - Benchmark: pickle_pure_python (Milliseconds, Fewer Is Better): 4484PX 169; a 165; px 168
QuantLib 1.35-dev - Size: S (tasks/s, More Is Better): 4484PX 11.86; a 12.75; px 11.84
QuantLib 1.35-dev - Size: XXS (tasks/s, More Is Better): 4484PX 12.12; a 13.43; px 12.11
QuantLib build note: (CXX) g++ options: -O3 -march=native -fPIE -pie
RELION 5.0 - Test: Basic - Device: CPU (Seconds, Fewer Is Better): 4484PX 729.40; a 944.27; px 733.02
RELION build note: (CXX) g++ options: -fPIC -std=c++14 -fopenmp -O3 -rdynamic -lfftw3f -lfftw3 -ldl -ltiff -lpng -ljpeg -lmpi_cxx -lmpi
Renaissance 0.16 - Test: Scala Dotty (ms, Fewer Is Better): 4484PX 428.6 (MIN: 378.22 / MAX: 628.77); a 477.0 (MIN: 371.54 / MAX: 736.5); px 436.2 (MIN: 380.62 / MAX: 721.56)
Renaissance 0.16 - Test: Random Forest (ms, Fewer Is Better): 4484PX 422.0 (MIN: 357.91 / MAX: 497.55); a 414.4 (MIN: 322.79 / MAX: 466.1); px 453.2 (MIN: 352.31 / MAX: 513.31)
Renaissance 0.16 - Test: ALS Movie Lens (ms, Fewer Is Better): 4484PX 9378.8 (MIN: 8718.36 / MAX: 9413.7); a 9805.7 (MIN: 9253.4 / MAX: 10057.61); px 9275.7 (MIN: 8821.09 / MAX: 9495.91)
Renaissance 0.16 - Test: Apache Spark Bayes (ms, Fewer Is Better): 4484PX 513.2 (MIN: 453.66 / MAX: 554.7); a 490.0 (MIN: 459.29 / MAX: 580.9); px 474.9 (MIN: 454.77 / MAX: 514.32)
Renaissance 0.16 - Test: Savina Reactors.IO (ms, Fewer Is Better): 4484PX 3655.8 (MIN: 3655.76 / MAX: 4484.97); a 3506.4 (MIN: 3506.38 / MAX: 4329.37); px 3676.0 (MAX: 4536.84)
Renaissance 0.16 - Test: Apache Spark PageRank (ms, Fewer Is Better): 4484PX 2138.1 (MIN: 1499.64); a 2412.2 (MIN: 1691.04); px 2229.7 (MIN: 1612.96 / MAX: 2229.74)
Renaissance 0.16 - Test: Finagle HTTP Requests (ms, Fewer Is Better): 4484PX 2492.2 (MIN: 1947.63); a 2319.4 (MIN: 1832.84); px 2483.1 (MIN: 1933.43)
Renaissance 0.16 - Test: Gaussian Mixture Model (ms, Fewer Is Better): 4484PX 3860.6 (MIN: 2758.89 / MAX: 3860.61); a 3399.5 (MIN: 2471.52); px 3815.2 (MIN: 2749.56 / MAX: 3815.24)
Renaissance 0.16 - Test: In-Memory Database Shootout (ms, Fewer Is Better): 4484PX 3241.5 (MIN: 3037.03 / MAX: 3491.91); a 3256.1 (MIN: 3019.89 / MAX: 3599.5); px 3175.6 (MIN: 2896.06 / MAX: 3367.44)
Renaissance 0.16 - Test: Akka Unbalanced Cobwebbed Tree (ms, Fewer Is Better): 4484PX 4038.4 (MIN: 4038.36 / MAX: 5089.28); a 4403.8 (MAX: 5719.11); px 4002.3 (MIN: 4002.27 / MAX: 4983.72)
Renaissance 0.16 - Test: Genetic Algorithm Using Jenetics + Futures (ms, Fewer Is Better): 4484PX 904.0 (MIN: 886.83 / MAX: 919.31); a 732.8 (MIN: 713.67 / MAX: 813.49); px 920.7 (MIN: 888.75 / MAX: 934.44)
Rustls 0.23.17 (handshakes/s, More Is Better)
Benchmark: handshake - Suite: TLS13_CHACHA20_POLY1305_SHA256 - 4484PX: 57716.64; a: 76454.45; px: 57688.08
Benchmark: handshake - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 - 4484PX: 59308.75; a: 80462.60; px: 59206.34
Benchmark: handshake-resume - Suite: TLS13_CHACHA20_POLY1305_SHA256 - 4484PX: 333882.92; a: 388077.69; px: 333574.30
Benchmark: handshake-ticket - Suite: TLS13_CHACHA20_POLY1305_SHA256 - 4484PX: 344296.24; a: 404263.45; px: 342775.29
Benchmark: handshake - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 - 4484PX: 306153.20; a: 423535.68; px: 304060.28
Benchmark: handshake-resume - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 - 4484PX: 3035330.21; a: 3563852.57; px: 3038723.48
Benchmark: handshake-ticket - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 - 4484PX: 2282729.64; a: 2620332.00; px: 2292879.44
Benchmark: handshake-resume - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 - 4484PX: 1586292.42; a: 1820810.21; px: 1572010.68
Benchmark: handshake-ticket - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 - 4484PX: 1329363.10; a: 1553632.14; px: 1340712.85
1. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
simdjson 3.10 (GB/s, More Is Better)
Throughput Test: Kostya - 4484PX: 6.11; a: 5.97; px: 5.45
Throughput Test: TopTweet - 4484PX: 10.82; a: 10.46; px: 10.51
Throughput Test: LargeRandom - 4484PX: 1.84; a: 1.83; px: 1.84
Throughput Test: PartialTweets - 4484PX: 10.10; a: 9.76; px: 8.35
Throughput Test: DistinctUserID - 4484PX: 10.76; a: 10.46; px: 8.97
1. (CXX) g++ options: -O3 -lrt
Stockfish Chess Benchmark (Nodes Per Second, More Is Better) - 4484PX: 33702298; a: 46507038; px: 33871595
1. Stockfish 16 by the Stockfish developers (see AUTHORS file)
Stockfish 17 Chess Benchmark (Nodes Per Second, More Is Better) - 4484PX: 45267546; a: 54752796; px: 42973396
1. (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -funroll-loops -msse -msse3 -mpopcnt -mavx2 -mbmi -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto-partition=one -flto=jobserver
SVT-AV1 2.3 (Frames Per Second, More Is Better)
Encoder Mode: Preset 3 - Input: Bosphorus 4K - 4484PX: 7.684; a: 9.590; px: 7.646
Encoder Mode: Preset 5 - Input: Bosphorus 4K - 4484PX: 29.09; a: 34.54; px: 28.82
Encoder Mode: Preset 8 - Input: Bosphorus 4K - 4484PX: 85.20; a: 102.01; px: 85.00
Encoder Mode: Preset 13 - Input: Bosphorus 4K - 4484PX: 198.11; a: 212.52; px: 194.02
Encoder Mode: Preset 3 - Input: Bosphorus 1080p - 4484PX: 25.45; a: 29.57; px: 25.45
Encoder Mode: Preset 5 - Input: Bosphorus 1080p - 4484PX: 88.42; a: 101.97; px: 88.27
Encoder Mode: Preset 8 - Input: Bosphorus 1080p - 4484PX: 287.05; a: 339.02; px: 286.96
Encoder Mode: Preset 13 - Input: Bosphorus 1080p - 4484PX: 776.12; a: 842.56; px: 769.82
Encoder Mode: Preset 3 - Input: Beauty 4K 10-bit - 4484PX: 1.188; a: 1.422; px: 1.184
Encoder Mode: Preset 5 - Input: Beauty 4K 10-bit - 4484PX: 5.602; a: 6.504; px: 5.551
Encoder Mode: Preset 8 - Input: Beauty 4K 10-bit - 4484PX: 10.97; a: 12.47; px: 10.86
Encoder Mode: Preset 13 - Input: Beauty 4K 10-bit - 4484PX: 17.41; a: 18.59; px: 17.36
1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Timed Eigen Compilation 3.4.0 (Seconds, Fewer Is Better)
Time To Compile - 4484PX: 67.36; a: 58.66; px: 67.08
Whisper.cpp 1.6.2 (Seconds, Fewer Is Better)
Model: ggml-base.en - Input: 2016 State of the Union - 4484PX: 92.71; a: 87.49; px: 93.45
Model: ggml-small.en - Input: 2016 State of the Union - 4484PX: 268.24; a: 245.08; px: 266.81
Model: ggml-medium.en - Input: 2016 State of the Union - 4484PX: 809.79; a: 700.91; px: 809.49
1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vnni
Whisperfile 20Aug24 (Seconds, Fewer Is Better)
Model Size: Tiny - 4484PX: 37.13; a: 41.71; px: 38.72
Model Size: Small - 4484PX: 173.38; a: 195.42; px: 167.89
Model Size: Medium - 4484PX: 473.55; a: 534.92; px: 475.51
x265 (Frames Per Second, More Is Better)
Video Input: Bosphorus 4K - 4484PX: 27.16; a: 32.57; px: 26.94
Video Input: Bosphorus 1080p - 4484PX: 101.37; a: 114.45; px: 101.25
1. x265 [info]: HEVC encoder version 3.5+1-f0c1022b6
XNNPACK b7b048 (us, Fewer Is Better)
Model: FP32MobileNetV1 - 4484PX: 1257; a: 1252; px: 1272
Model: FP32MobileNetV2 - 4484PX: 1365; a: 1495; px: 1368
Model: FP32MobileNetV3Large - 4484PX: 1515; a: 1810; px: 1574
Model: FP32MobileNetV3Small - 4484PX: 809; a: 979; px: 837
Model: FP16MobileNetV1 - 4484PX: 1383; a: 1143; px: 1386
Model: FP16MobileNetV2 - 4484PX: 1217; a: 1190; px: 1248
Model: FP16MobileNetV3Large - 4484PX: 1467; a: 1498; px: 1527
Model: FP16MobileNetV3Small - 4484PX: 779; a: 920; px: 798
Model: QS8MobileNetV2 - 4484PX: 717; a: 844; px: 723
1. (CXX) g++ options: -O3 -lrt -lm
Y-Cruncher 0.8.5 (Seconds, Fewer Is Better)
Pi Digits To Calculate: 1B - 4484PX: 18.38; a: 18.49; px: 18.37
Pi Digits To Calculate: 500M - 4484PX: 8.688; a: 8.772; px: 8.623
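Because the results mix "More Is Better" and "Fewer Is Better" metrics, comparing two runs requires inverting the ratio for time-based tests. The following is a minimal sketch of that arithmetic, not part of the Phoronix Test Suite output; the function name `speedup` and its parameters are hypothetical, and the sample values are copied from the Rustls handshake (TLS13_CHACHA20_POLY1305_SHA256) and Timed Eigen Compilation results for runs `4484PX` and `a`.

```python
def speedup(baseline: float, candidate: float, higher_is_better: bool) -> float:
    """Return how many times faster `candidate` is than `baseline`.

    A value above 1.0 means the candidate run is ahead; below 1.0 means it trails.
    """
    if baseline <= 0 or candidate <= 0:
        raise ValueError("benchmark results must be positive")
    # Throughput-style metrics: larger is faster, so divide candidate by baseline.
    # Time-style metrics: smaller is faster, so the ratio is inverted.
    return candidate / baseline if higher_is_better else baseline / candidate


# Rustls handshake, handshakes/s (More Is Better): run "4484PX" vs run "a".
rustls = speedup(57716.64, 76454.45, higher_is_better=True)

# Timed Eigen Compilation, seconds (Fewer Is Better): run "4484PX" vs run "a".
eigen = speedup(67.36, 58.66, higher_is_better=False)

print(f"Rustls handshake: run a is {rustls:.3f}x run 4484PX")
print(f"Eigen compile:    run a is {eigen:.3f}x run 4484PX")
```

Expressing every test as a ratio above or below 1.0 makes the two metric directions directly comparable, which is a prerequisite for any aggregate such as a geometric mean.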
Phoronix Test Suite v10.8.5