genoa tests eoy2024
Benchmarks for a future article. 2 x AMD EPYC 9124 16-Core testing with an AMD Titanite_4G (RTI1007B BIOS) and ASPEED on Ubuntu 23.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2412273-NE-GENOATEST41&rdt&grs
genoa tests eoy2024 system details (configurations: a, b)
Processor: 2 x AMD EPYC 9124 16-Core @ 3.00GHz (32 Cores / 64 Threads)
Motherboard: AMD Titanite_4G (RTI1007B BIOS)
Chipset: AMD Device 14a4
Memory: 1520GB
Disk: 3201GB Micron_7450_MTFDKCB3T2TFS
Graphics: ASPEED
Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 23.10
Kernel: 6.9.0-060900rc1daily20240327-generic (x86_64)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 1920x1200
Kernel Details: Transparent Huge Pages: madvise
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10113e
Java Details: OpenJDK Runtime Environment (build 11.0.23+9-post-Ubuntu-1ubuntu123.10.1)
Python Details: Python 3.11.6
Security Details: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
genoa tests eoy2024 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024 onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard onnx: yolov4 - CPU - Standard onnx: ResNet101_DUC_HDC-12 - CPU - Standard llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024 litert: NASNet Mobile onnx: fcn-resnet101-11 - CPU - Standard llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024 svt-av1: Preset 13 - Bosphorus 4K renaissance: Rand Forest renaissance: Savina Reactors.IO webp: Default renaissance: Scala Dotty xnnpack: FP16MobileNetV3Large onnx: GPT-2 - CPU - Standard cpuminer-opt: Garlicoin xnnpack: FP16MobileNetV3Small svt-av1: Preset 13 - Bosphorus 1080p onnx: ArcFace ResNet-100 - CPU - Standard renaissance: Akka Unbalanced Cobwebbed Tree stockfish: Chess Benchmark renaissance: In-Memory Database Shootout renaissance: Apache Spark Bayes compress-7zip: Decompression Rating x265: Bosphorus 4K rustls: handshake - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 openvino-genai: Falcon-7b-instruct-int4-ov - CPU build-eigen: Time To Compile xnnpack: FP32MobileNetV3Large srsran: PDSCH Processor Benchmark, Throughput Total vvenc: Bosphorus 4K - Faster xnnpack: FP16MobileNetV2 xnnpack: FP32MobileNetV2 whisper-cpp: ggml-base.en - 2016 State of the Union vvenc: Bosphorus 1080p - Fast xnnpack: FP16MobileNetV1 x265: Bosphorus 1080p whisperfile: Small whisperfile: Tiny litert: Mobilenet Quant cpuminer-opt: x20r xnnpack: FP32MobileNetV1 vvenc: Bosphorus 1080p - Faster renaissance: ALS Movie Lens simdjson: Kostya litert: Mobilenet Float onednn: Recurrent Neural Network Training - CPU xnnpack: QS8MobileNetV2 onednn: Deconvolution Batch shapes_1d - CPU compress-7zip: Compression Rating litert: DeepLab V3 uvg266: Bosphorus 1080p - Very Fast llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128 litert: Inception V4 uvg266: Bosphorus 1080p - Ultra Fast simdjson: DistinctUserID y-cruncher: 5B warpx: Plasma 
Acceleration onnx: ZFNet-512 - CPU - Standard litert: SqueezeNet x265: Bosphorus 4K uvg266: Bosphorus 1080p - Super Fast renaissance: Genetic Algorithm Using Jenetics + Futures build2: Time To Compile xnnpack: FP32MobileNetV3Small rustls: handshake-resume - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 simdjson: PartialTweets webp: Quality 100 blender: BMW27 - CPU-Only litert: Quantized COCO SSD MobileNet v1 ospray: particle_volume/ao/real_time onnx: super-resolution-10 - CPU - Standard stress-ng: Context Switching onnx: ResNet50 v1-12-int8 - CPU - Standard palabos: 500 ospray: gravity_spheres_volume/dim_512/scivis/real_time simdjson: TopTweet stress-ng: CPU Stress onednn: IP Shapes 3D - CPU palabos: 400 simdjson: LargeRand build-php: Time To Compile namd: ATPase with 327,506 Atoms openvino-genai: Gemma-7b-int4-ov - CPU uvg266: Bosphorus 4K - Super Fast vvenc: Bosphorus 4K - Fast renaissance: Apache Spark PageRank svt-av1: Preset 5 - Bosphorus 4K y-cruncher: 1B webp: Quality 100, Lossless whisperfile: Medium onnx: CaffeNet 12-int8 - CPU - Standard gromacs: water_GMX50_bare rustls: handshake-ticket - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 renaissance: Gaussian Mixture Model z3: 2.smt2 onnx: bertsquad-12 - CPU - Standard cpuminer-opt: Magi ospray: gravity_spheres_volume/dim_512/ao/real_time uvg266: Bosphorus 4K - Ultra Fast llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128 onnx: T5 Encoder - CPU - Standard compress-lz4: 12 - Decompression Speed z3: 1.smt2 mt-dgemm: Sustained Floating-Point Rate cp2k: H20-256 onednn: Convolution Batch Shapes Auto - CPU whisper-cpp: ggml-medium.en - 2016 State of the Union whisper-cpp: ggml-small.en - 2016 State of the Union ospray: particle_volume/pathtracer/real_time compress-lz4: 12 - Compression Speed byte: Dhrystone 2 uvg266: Bosphorus 1080p - Slow svt-av1: Preset 5 - Bosphorus 1080p y-cruncher: 500M blender: Pabellon Barcelona - CPU-Only onednn: Deconvolution Batch shapes_3d - CPU svt-av1: Preset 3 - Bosphorus 4K 
openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU svt-av1: Preset 8 - Bosphorus 1080p cpuminer-opt: Ringcoin compress-lz4: 9 - Compression Speed openvino: Person Re-Identification Retail FP16 - CPU llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128 rustls: handshake-resume - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 openvino: Person Vehicle Bike Detection FP16 - CPU openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU openvino: Road Segmentation ADAS FP16 - CPU blender: Fishy Cat - CPU-Only uvg266: Bosphorus 4K - Very Fast compress-lz4: 2 - Compression Speed openssl: SHA256 uvg266: Bosphorus 1080p - Medium build-nodejs: Time To Compile openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Face Detection Retail FP16-INT8 - CPU compress-lz4: 9 - Decompression Speed ospray: particle_volume/scivis/real_time blender: Junkshop - CPU-Only openssl: AES-128-GCM onednn: IP Shapes 1D - CPU openvino: Vehicle Detection FP16 - CPU svt-av1: Preset 3 - Bosphorus 1080p openvino: Road Segmentation ADAS FP16-INT8 - CPU uvg266: Bosphorus 4K - Medium palabos: 100 warpx: Uniform Plasma ospray: gravity_spheres_volume/dim_512/pathtracer/real_time x265: Bosphorus 1080p openvino: Road Segmentation ADAS FP16 - CPU compress-lz4: 3 - Decompression Speed openvino: Handwritten English Recognition FP16-INT8 - CPU namd: STMV with 1,066,628 Atoms openvino: Person Detection FP16 - CPU openvino: Person Detection FP16 - CPU onednn: Recurrent Neural Network Inference - CPU laghos: Triple Point Problem openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Face Detection Retail FP16-INT8 - CPU cpuminer-opt: Triple SHA-256, Onecoin openssl: ChaCha20 openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Machine Translation EN To DE FP16 - CPU c-ray: 4K - 16 
blender: Barbershop - CPU-Only rustls: handshake-ticket - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 openvino: Machine Translation EN To DE FP16 - CPU astcenc: Thorough stress-ng: Radix String Sort openvino: Face Detection Retail FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU quantlib: XXS openvino: Noise Suppression Poconet-Like FP16 - CPU stress-ng: Socket Activity openvino: Age Gender Recognition Retail 0013 FP16 - CPU compress-lz4: 1 - Compression Speed openvino: Weld Porosity Detection FP16-INT8 - CPU blender: Classroom - CPU-Only cpuminer-opt: Deepcoin openvino: Road Segmentation ADAS FP16-INT8 - CPU byte: Whetstone Double uvg266: Bosphorus 4K - Slow openvino: Vehicle Detection FP16-INT8 - CPU rustls: handshake - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 openssl: AES-256-GCM compress-lz4: 2 - Decompression Speed astcenc: Fast c-ray: 1080p - 16 astcenc: Very Thorough openvino: Face Detection FP16-INT8 - CPU openvino: Face Detection FP16 - CPU astcenc: Exhaustive openvino: Person Re-Identification Retail FP16 - CPU openvino: Noise Suppression Poconet-Like FP16 - CPU srsran: PUSCH Processor Benchmark, Throughput Total openvino: Person Detection FP32 - CPU compress-lz4: 1 - Decompression Speed svt-av1: Preset 8 - Bosphorus 4K primesieve: 1e12 primesieve: 1e13 openssl: RSA4096 cpuminer-opt: scrypt openssl: AES-128-GCM quantlib: S cpuminer-opt: Quad SHA-256, Pyrite byte: Pipe compress-lz4: 3 - Compression Speed litert: Inception ResNet V2 openvino: Person Detection FP32 - CPU openvino: Face Detection FP16 - CPU openssl: SHA512 byte: System Call stress-ng: Bitonic Integer Sort cpuminer-opt: Blake-2 S openssl: ChaCha20-Poly1305 openssl: SHA256 openssl: ChaCha20-Poly1305 renaissance: Finagle HTTP Requests openvino: Face Detection FP16-INT8 - CPU astcenc: Medium openssl: AES-256-GCM openssl: ChaCha20 c-ray: 5K - 16 laghos: Sedov Blast Wave, ube_922_hex.mesh openssl: SHA512 openssl: RSA4096 openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino: Age 
Gender Recognition Retail 0013 FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Face Detection Retail FP16 - CPU cpuminer-opt: LBC, LBRY Credits cpuminer-opt: Myriad-Groestl cpuminer-opt: Skeincoin oidn: RTLightmap.hdr.4096x4096 - CPU-Only oidn: RT.ldr_alb_nrm.3840x2160 - CPU-Only oidn: RT.hdr_alb_nrm.3840x2160 - CPU-Only webp: Quality 100, Lossless, Highest Compression webp: Quality 100, Highest Compression openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU - Time Per Output Token openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU - Time To First Token openvino-genai: Falcon-7b-instruct-int4-ov - CPU - Time Per Output Token openvino-genai: Falcon-7b-instruct-int4-ov - CPU - Time To First Token openvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU - Time Per Output Token openvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU - Time To First Token openvino-genai: Gemma-7b-int4-ov - CPU - Time Per Output Token openvino-genai: Gemma-7b-int4-ov - CPU - Time To First Token onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard onnx: ResNet101_DUC_HDC-12 - CPU - Standard onnx: super-resolution-10 - CPU - Standard onnx: ResNet50 v1-12-int8 - CPU - Standard onnx: ArcFace ResNet-100 - CPU - Standard onnx: fcn-resnet101-11 - CPU - Standard onnx: CaffeNet 12-int8 - CPU - Standard onnx: bertsquad-12 - CPU - Standard onnx: T5 Encoder - CPU - Standard onnx: ZFNet-512 - CPU - Standard onnx: yolov4 - CPU - Standard onnx: GPT-2 - CPU - Standard renaissance: Apache Spark ALS a b 267.01 33.7484 6.98157 1.67151 112.18 42858.2 3.6652 99.46 184.519 788.0 10442.4 18.60 801.9 5309 138.699 7384.28 3878 529.371 29.6861 18669.0 100001520 6745.9 260.0 239848 30.76 401303.16 25.47 42.209 5319 40641.5 13.228 3458 3628 138.26158 19.411 2095 87.75 154.79938 50.45276 1747.96 12480 2145 39.997 25095.1 4.18 2057.94 806.396 3611 5.80994 256434 4985.32 121.3 74.22 23103.8 143.36 6.66 52.861 28.40062329 123.608 3102.45 26.13 131.92 2237.2 80.398 4008 
1663749.32 6.55 11.30 36.74 3245.22 12.8579 99.6682 7693259.49 222.671 412.575 10.2963 7.21 83077.92 0.724443 397.005 1.25 44.663 5.17004 20.24 39.26 6.979 3766.0 31.929 9.23 1.44 343.12909 584.297 3.837 1952598.53 4435.4 78.124 13.848 952.87 10.5125 42.08 25.81 261.652 4331 32.661 1704.035273 221.369 1.09519 888.00606 287.77075 192.693 12.7 2867577563.1 44.16 84.194 4.485 115.05 1.72958 9.588 89464.75 243.702 5108.3 36.67 3.38 27.16 2683018.02 1939.89 36.58 676.1 48.58 38.45 315.74 52088429500 48.47 273.332 4.11 1547.61 1003.55 51.47 31.85 4.67 4217.5 12.947 52.71 391651514500 0.877683 5.14 26.323 10.78 17.14 292.345 19.57102025 12.3457 76.04 11.82 4031 29.92 1.57997 209.35 38.19 458.249 219.97 1068.15 6810.88 121450 198651380530 8.16 36.89 74.731 361.09 1342172.28 216.75 29.8005 739.37 4519.36 2032.87 28.0287 11.82 17142.77 60277.78 695.86 3909.5 95.81 13680 740.54 500231.6 15.47 2233.42 82026.08 335590591420 3856.4 534.8428 18.839 4.1034 39.12 20.83 2.5106 2362.71 2676.03 2260.8 38.2 4460.4 91.143 4.15 50.267 558789.7 482.67 393314332400 27.5863 102660 58878998 107.13 26499.5 209.21 383.43 16883923960 47042441.2 385.6 220790 141481760360 52436660260 141514205590 3928.6 204.27 221.8177 334767993920 198630347840 132.726 284.26 16971437080 22944.9 0.34 0.52 15.72 3.56 1.76 24690 17760 67110 0.67 1.41 1.41 0.57 3.60 27.34 58.48 39.26 104.07 19.43 21.71 49.41 115.87 29.6269 598.257 10.0326 4.48901 33.683 272.833 1.71077 72.2085 3.82017 8.08821 143.23 7.2044 189.81 43.4229 8.59803 1.94512 96.97 48856.7 4.16952 110.77 166.11 713.7 11336.1 17.33 751.0 4990 130.761 6969.32 3694 505.636 31.0624 19497.4 104118302 7016.4 250.5 248785 29.69 387481.69 26.36 40.787 5500 41979.9 13.627 3358 3733 134.45784 19.943 2040 85.5 150.91319 51.75138 1708.44 12750 2101 39.205 25588.8 4.26 2020.43 821.069 3547 5.70868 251980 5073.41 123.34 73.02 22746.1 141.19 6.56 52.084 28.81928408 125.379 3060.9 25.79 133.52 2264.2 81.362 3963 1681659.16 6.62 11.42 36.36 3213.83 12.9818 98.7203 
7764860.59 220.75 409.041 10.3832 7.27 83768.42 0.718503 400.261 1.26 44.321 5.13084 20.09 39.55 6.928 3793.3 32.159 9.166 1.45 345.49469 580.471 3.862 1965288.44 4406.8 77.621 13.9324 958.57 10.4511 41.85 25.67 263.007 4310.1 32.815 1711.894448 222.385 1.10013 884.09875 289.01438 191.877 12.65 2856294482 43.99 84.519 4.502 114.62 1.7359 9.555 89738.8 244.444 5092.89 36.78 3.37 27.24 2675450.81 1945.3 36.48 674.27 48.71 38.35 316.56 52220542360 48.59 274.008 4.1 1551.35 1005.94 51.35 31.78 4.66 4226.5 12.92 52.82 392446885890 0.875916 5.13 26.273 10.8 17.11 291.837 19.6050734 12.367 76.17 11.84 4037.8 29.97 1.57748 209.68 38.13 457.556 220.28 1066.68 6820.14 121600 198407362970 8.15 36.93 74.651 361.47 1340803.72 216.55 29.8279 738.7 4523.45 2031.11 28.0526 11.81 17128.52 60327.61 695.34 3912.36 95.74 13690 740.04 499902.4 15.46 2234.86 81973.92 335400128510 3854.3 534.5559 18.849 4.1055 39.1 20.84 2.5098 2363.42 2676.83 2260.2 38.21 4459.3 91.165 4.151 50.279 558659.8 482.56 393393574980 27.5917 102640 58867931.7 107.11 26503.6 209.24 383.38 16881854110 47036981.1 385.56 220810 141469554280 52440112330 141522365640 3928.4 204.28 221.8274 334753808110 198638149630 132.731 284.27 16971041430 22944.6 0.34 0.52 15.72 3.56 1.76 24690 17760 67110 0.67 1.41 1.41 0.57 3.60 27.41 57.04 37.93 98.97 19.47 21.89 49.77 114.75 23.0254 514.102 10.1289 4.5277 32.1906 239.832 1.72194 71.771 3.80081 7.97344 116.302 7.64339 OpenBenchmarking.org
Llama.cpp b4154 - Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024. Tokens Per Second, More Is Better. a: 267.01, b: 189.81. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
ONNX Runtime 1.19 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard. Inferences Per Second, More Is Better. a: 33.75, b: 43.42. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 - Model: yolov4 - Device: CPU - Executor: Standard. Inferences Per Second, More Is Better. a: 6.98157, b: 8.59803. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 - Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard. Inferences Per Second, More Is Better. a: 1.67151, b: 1.94512. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
Llama.cpp b4154 - Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024. Tokens Per Second, More Is Better. a: 112.18, b: 96.97. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
LiteRT 2024-10-15 - Model: NASNet Mobile. Microseconds, Fewer Is Better. a: 42858.2, b: 48856.7.
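Because some results are "More Is Better" and others "Fewer Is Better", the raw a/b numbers cannot be compared with a single formula. The sketch below is one possible way to express run b relative to run a as a signed percentage, where positive always means b performed better. The function name `relative_change` and the sign convention are assumptions made for illustration; they are not part of the Phoronix Test Suite.

```python
def relative_change(a: float, b: float, higher_is_better: bool) -> float:
    """Return how much better (+) or worse (-) run b is than run a, in percent.

    Hypothetical helper: the sign convention is an assumption, not a
    Phoronix Test Suite definition.
    """
    if higher_is_better:
        # Throughput-style metric: a larger value for b is an improvement.
        return (b / a - 1.0) * 100.0
    # Time-style metric: a smaller value for b is an improvement,
    # so the ratio is inverted.
    return (a / b - 1.0) * 100.0


# Values copied from the results listed above.
llama_granite_pp = relative_change(267.01, 189.81, higher_is_better=True)
nasnet_mobile = relative_change(42858.2, 48856.7, higher_is_better=False)

print(f"Llama.cpp granite prompt processing: {llama_granite_pp:+.1f}%")
print(f"LiteRT NASNet Mobile: {nasnet_mobile:+.1f}%")
```

Both values come out negative here, meaning run b was slower than run a on these two tests.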
ONNX Runtime 1.19 - Model: fcn-resnet101-11 - Device: CPU - Executor: Standard. Inferences Per Second, More Is Better. a: 3.66520, b: 4.16952. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
Llama.cpp b4154 - Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024. Tokens Per Second, More Is Better. a: 99.46, b: 110.77. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
SVT-AV1 2.3 - Encoder Mode: Preset 13 - Input: Bosphorus 4K. Frames Per Second, More Is Better. a: 184.52, b: 166.11. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Renaissance 0.16 - Test: Random Forest. ms, Fewer Is Better. a: 788.0 (MIN: 591.22 / MAX: 890.34), b: 713.7 (MIN: 579.9 / MAX: 908.5).
Renaissance 0.16 - Test: Savina Reactors.IO. ms, Fewer Is Better. a: 10442.4 (MIN: 9259.86 / MAX: 11965.68), b: 11336.1 (MIN: 9682.35 / MAX: 13514.5).
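The Renaissance results report a MIN and MAX alongside each value, which gives a rough sense of run-to-run spread. As a minimal sketch, assuming the first MIN/MAX pair belongs to run a and the second to run b, the following check flags whether each run's reported value falls inside the other run's range. This is only an informal heuristic for spotting differences that may be noise, not a statistical test, and the names `RunResult` and `within_spread` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    value: float  # reported result for the run
    low: float    # reported MIN
    high: float   # reported MAX


def within_spread(x: RunResult, y: RunResult) -> bool:
    """True if each run's reported value lies inside the other's MIN-MAX range."""
    return y.low <= x.value <= y.high and x.low <= y.value <= x.high


# Renaissance 0.16, Random Forest (ms), values copied from the result above.
random_forest_a = RunResult(value=788.0, low=591.22, high=890.34)
random_forest_b = RunResult(value=713.7, low=579.9, high=908.5)

if within_spread(random_forest_a, random_forest_b):
    print("Random Forest: a and b fall within each other's MIN-MAX range")
else:
    print("Random Forest: a and b differ by more than the reported spread")
```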
WebP Image Encode 1.4 - Encode Settings: Default. MP/s, More Is Better. a: 18.60, b: 17.33. (CC) gcc options: -fvisibility=hidden -O2 -lm
Renaissance 0.16 - Test: Scala Dotty. ms, Fewer Is Better. a: 801.9 (MIN: 655.46 / MAX: 1303.91), b: 751.0 (MIN: 640.27 / MAX: 1261.2).
XNNPACK b7b048 - Model: FP16MobileNetV3Large. us, Fewer Is Better. a: 5309, b: 4990. (CXX) g++ options: -O3 -lrt -lm
ONNX Runtime 1.19 - Model: GPT-2 - Device: CPU - Executor: Standard. Inferences Per Second, More Is Better. a: 138.70, b: 130.76. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
Cpuminer-Opt 24.3 - Algorithm: Garlicoin. kH/s, More Is Better. a: 7384.28, b: 6969.32. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lgmp
XNNPACK b7b048 - Model: FP16MobileNetV3Small. us, Fewer Is Better. a: 3878, b: 3694. (CXX) g++ options: -O3 -lrt -lm
SVT-AV1 2.3 - Encoder Mode: Preset 13 - Input: Bosphorus 1080p. Frames Per Second, More Is Better. a: 529.37, b: 505.64. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
ONNX Runtime 1.19 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard. Inferences Per Second, More Is Better. a: 29.69, b: 31.06. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
Renaissance 0.16 - Test: Akka Unbalanced Cobwebbed Tree. ms, Fewer Is Better. a: 18669.0 (MIN: 17763.87 / MAX: 19829.6), b: 19497.4 (MIN: 19145.14 / MAX: 19580.86).
Stockfish 17 - Chess Benchmark. Nodes Per Second, More Is Better. a: 100001520, b: 104118302. (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -funroll-loops -msse -msse3 -mpopcnt -mavx2 -mbmi -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto-partition=one -flto=jobserver
Renaissance 0.16 - Test: In-Memory Database Shootout. ms, Fewer Is Better. a: 6745.9 (MIN: 5952.22 / MAX: 7932.15), b: 7016.4 (MIN: 6357.26 / MAX: 8122.82).
Renaissance 0.16 - Test: Apache Spark Bayes. ms, Fewer Is Better. a: 260.0 (MIN: 219.27 / MAX: 423.67), b: 250.5 (MIN: 217.63 / MAX: 448.45).
7-Zip Compression 24.05 - Test: Decompression Rating. MIPS, More Is Better. a: 239848, b: 248785. (CXX) g++ options: -lpthread -ldl -O2 -fPIC
x265 4.1 - Video Input: Bosphorus 4K. Frames Per Second, More Is Better. a: 30.76, b: 29.69. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Rustls 0.23.17 - Benchmark: handshake - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384. handshakes/s, More Is Better. a: 401303.16, b: 387481.69. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
OpenVINO GenAI 2024.5 - Model: Falcon-7b-instruct-int4-ov - Device: CPU. tokens/s, More Is Better. a: 25.47, b: 26.36.
Timed Eigen Compilation 3.4.0 - Time To Compile. Seconds, Fewer Is Better. a: 42.21, b: 40.79.
XNNPACK b7b048 - Model: FP32MobileNetV3Large. us, Fewer Is Better. a: 5319, b: 5500. (CXX) g++ options: -O3 -lrt -lm
srsRAN Project 24.10 - Test: PDSCH Processor Benchmark, Throughput Total. Mbps, More Is Better. a: 40641.5, b: 41979.9. (CXX) g++ options: -O3 -march=native -mtune=generic -fno-trapping-math -fno-math-errno -ldl
VVenC 1.13 - Video Input: Bosphorus 4K - Video Preset: Faster. Frames Per Second, More Is Better. a: 13.23, b: 13.63. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
XNNPACK b7b048 - Model: FP16MobileNetV2. us, Fewer Is Better. a: 3458, b: 3358. (CXX) g++ options: -O3 -lrt -lm
XNNPACK b7b048 - Model: FP32MobileNetV2. us, Fewer Is Better. a: 3628, b: 3733. (CXX) g++ options: -O3 -lrt -lm
Whisper.cpp 1.6.2 - Model: ggml-base.en - Input: 2016 State of the Union. Seconds, Fewer Is Better. a: 138.26, b: 134.46. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vnni
VVenC 1.13 - Video Input: Bosphorus 1080p - Video Preset: Fast. Frames Per Second, More Is Better. a: 19.41, b: 19.94. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
XNNPACK b7b048 - Model: FP16MobileNetV1. us, Fewer Is Better. a: 2095, b: 2040. (CXX) g++ options: -O3 -lrt -lm
x265 4.1 - Video Input: Bosphorus 1080p. Frames Per Second, More Is Better. a: 87.75, b: 85.50. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Whisperfile 20Aug24 - Model Size: Small. Seconds, Fewer Is Better. a: 154.80, b: 150.91.
Whisperfile 20Aug24 - Model Size: Tiny. Seconds, Fewer Is Better. a: 50.45, b: 51.75.
LiteRT 2024-10-15 - Model: Mobilenet Quant. Microseconds, Fewer Is Better. a: 1747.96, b: 1708.44.
Cpuminer-Opt 24.3 - Algorithm: x20r. kH/s, More Is Better. a: 12480, b: 12750. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lgmp
XNNPACK b7b048 - Model: FP32MobileNetV1. us, Fewer Is Better. a: 2145, b: 2101. (CXX) g++ options: -O3 -lrt -lm
VVenC 1.13 - Video Input: Bosphorus 1080p - Video Preset: Faster. Frames Per Second, More Is Better. a: 40.00, b: 39.21. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
Renaissance 0.16 - Test: ALS Movie Lens. ms, Fewer Is Better. a: 25095.1 (MIN: 24327.2), b: 25588.8 (MIN: 24309).
simdjson 3.10 - Throughput Test: Kostya. GB/s, More Is Better. a: 4.18, b: 4.26. (CXX) g++ options: -O3 -lrt
LiteRT 2024-10-15 - Model: Mobilenet Float. Microseconds, Fewer Is Better. a: 2057.94, b: 2020.43.
oneDNN 3.6 - Harness: Recurrent Neural Network Training - Engine: CPU. ms, Fewer Is Better. a: 806.40 (MIN: 803.94), b: 821.07 (MIN: 801.71). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
XNNPACK b7b048 - Model: QS8MobileNetV2. us, Fewer Is Better. a: 3611, b: 3547. (CXX) g++ options: -O3 -lrt -lm
oneDNN 3.6 - Harness: Deconvolution Batch shapes_1d - Engine: CPU. ms, Fewer Is Better. a: 5.80994 (MIN: 3.84), b: 5.70868 (MIN: 5.17). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
7-Zip Compression 24.05 - Test: Compression Rating. MIPS, More Is Better. a: 256434, b: 251980. (CXX) g++ options: -lpthread -ldl -O2 -fPIC
LiteRT 2024-10-15 - Model: DeepLab V3. Microseconds, Fewer Is Better. a: 4985.32, b: 5073.41.
uvg266 0.8.0 - Video Input: Bosphorus 1080p - Video Preset: Very Fast. Frames Per Second, More Is Better. a: 121.30, b: 123.34.
Llama.cpp b4154 - Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128. Tokens Per Second, More Is Better. a: 74.22, b: 73.02. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
LiteRT 2024-10-15 - Model: Inception V4. Microseconds, Fewer Is Better. a: 23103.8, b: 22746.1.
uvg266 0.8.0 - Video Input: Bosphorus 1080p - Video Preset: Ultra Fast. Frames Per Second, More Is Better. a: 143.36, b: 141.19.
simdjson 3.10 - Throughput Test: DistinctUserID. GB/s, More Is Better. a: 6.66, b: 6.56. (CXX) g++ options: -O3 -lrt
Y-Cruncher 0.8.5 - Pi Digits To Calculate: 5B. Seconds, Fewer Is Better. a: 52.86, b: 52.08.
WarpX 24.10 - Input: Plasma Acceleration. Seconds, Fewer Is Better. a: 28.40, b: 28.82. (CXX) g++ options: -O3 -lm
ONNX Runtime 1.19 - Model: ZFNet-512 - Device: CPU - Executor: Standard. Inferences Per Second, More Is Better. a: 123.61, b: 125.38. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
LiteRT 2024-10-15 - Model: SqueezeNet. Microseconds, Fewer Is Better. a: 3102.45, b: 3060.90.
x265 - Video Input: Bosphorus 4K. Frames Per Second, More Is Better. a: 26.13, b: 25.79. x265 [info]: HEVC encoder version 3.5+1-f0c1022b6
uvg266 0.8.0 - Video Input: Bosphorus 1080p - Video Preset: Super Fast. Frames Per Second, More Is Better. a: 131.92, b: 133.52.
Renaissance 0.16 - Test: Genetic Algorithm Using Jenetics + Futures. ms, Fewer Is Better. a: 2237.2 (MIN: 1595.98), b: 2264.2 (MIN: 1665.04).
Build2 0.17 - Time To Compile. Seconds, Fewer Is Better. a: 80.40, b: 81.36.
XNNPACK Model: FP32MobileNetV3Small OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP32MobileNetV3Small a b 900 1800 2700 3600 4500 4008 3963 1. (CXX) g++ options: -O3 -lrt -lm
Rustls Benchmark: handshake-resume - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 OpenBenchmarking.org handshakes/s, More Is Better Rustls 0.23.17 Benchmark: handshake-resume - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 a b 400K 800K 1200K 1600K 2000K 1663749.32 1681659.16 1. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
simdjson Throughput Test: PartialTweets OpenBenchmarking.org GB/s, More Is Better simdjson 3.10 Throughput Test: PartialTweets a b 2 4 6 8 10 6.55 6.62 1. (CXX) g++ options: -O3 -lrt
WebP Image Encode Encode Settings: Quality 100 OpenBenchmarking.org MP/s, More Is Better WebP Image Encode 1.4 Encode Settings: Quality 100 a b 3 6 9 12 15 11.30 11.42 1. (CC) gcc options: -fvisibility=hidden -O2 -lm
Blender Blend File: BMW27 - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.3 Blend File: BMW27 - Compute: CPU-Only a b 8 16 24 32 40 36.74 36.36
LiteRT Model: Quantized COCO SSD MobileNet v1 OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Quantized COCO SSD MobileNet v1 a b 700 1400 2100 2800 3500 3245.22 3213.83
OSPRay Benchmark: particle_volume/ao/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 3.2 Benchmark: particle_volume/ao/real_time a b 3 6 9 12 15 12.86 12.98
ONNX Runtime Model: super-resolution-10 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: super-resolution-10 - Device: CPU - Executor: Standard a b 20 40 60 80 100 99.67 98.72 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
Stress-NG 0.17.08 Test: Context Switching (Bogo Ops/s, More Is Better): a = 7693259.49, b = 7764860.59. Note: (CXX) g++ options: -O2 -std=gnu99 -lc -lm
ONNX Runtime 1.19 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better): a = 222.67, b = 220.75. Note: (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
Palabos 2.3 Grid Size: 500 (Mega Site Updates Per Second, More Is Better): a = 412.58, b = 409.04. Note: (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
OSPRay 3.2 Benchmark: gravity_spheres_volume/dim_512/scivis/real_time (Items Per Second, More Is Better): a = 10.30, b = 10.38.
simdjson 3.10 Throughput Test: TopTweet (GB/s, More Is Better): a = 7.21, b = 7.27. Note: (CXX) g++ options: -O3 -lrt
Stress-NG 0.17.08 Test: CPU Stress (Bogo Ops/s, More Is Better): a = 83077.92, b = 83768.42. Note: (CXX) g++ options: -O2 -std=gnu99 -lc -lm
oneDNN 3.6 Harness: IP Shapes 3D - Engine: CPU (ms, Fewer Is Better): a = 0.724443 (MIN: 0.69), b = 0.718503 (MIN: 0.69). Note: (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
Palabos 2.3 Grid Size: 400 (Mega Site Updates Per Second, More Is Better): a = 397.01, b = 400.26. Note: (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
simdjson 3.10 Throughput Test: LargeRandom (GB/s, More Is Better): a = 1.25, b = 1.26. Note: (CXX) g++ options: -O3 -lrt
Timed PHP Compilation 8.3.4 Time To Compile (Seconds, Fewer Is Better): a = 44.66, b = 44.32.
NAMD 3.0 Input: ATPase with 327,506 Atoms (ns/day, More Is Better): a = 5.17004, b = 5.13084.
OpenVINO GenAI 2024.5 Model: Gemma-7b-int4-ov - Device: CPU (tokens/s, More Is Better): a = 20.24, b = 20.09.
uvg266 0.8.0 Video Input: Bosphorus 4K - Video Preset: Super Fast (Frames Per Second, More Is Better): a = 39.26, b = 39.55.
VVenC 1.13 Video Input: Bosphorus 4K - Video Preset: Fast (Frames Per Second, More Is Better): a = 6.979, b = 6.928. Note: (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
Renaissance 0.16 Test: Apache Spark PageRank (ms, Fewer Is Better): a = 3766.0 (MIN: 3285.53 / MAX: 3766.03), b = 3793.3 (MIN: 3259.12 / MAX: 3793.31).
SVT-AV1 2.3 Encoder Mode: Preset 5 - Input: Bosphorus 4K (Frames Per Second, More Is Better): a = 31.93, b = 32.16. Note: (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Y-Cruncher 0.8.5 Pi Digits To Calculate: 1B (Seconds, Fewer Is Better): a = 9.230, b = 9.166.
WebP Image Encode 1.4 Encode Settings: Quality 100, Lossless (MP/s, More Is Better): a = 1.44, b = 1.45. Note: (CC) gcc options: -fvisibility=hidden -O2 -lm
Whisperfile 20Aug24 Model Size: Medium (Seconds, Fewer Is Better): a = 343.13, b = 345.49.
ONNX Runtime 1.19 Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better): a = 584.30, b = 580.47. Note: (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
GROMACS Input: water_GMX50_bare (Ns Per Day, More Is Better): a = 3.837, b = 3.862. Note: GROMACS version: 2023.1-Ubuntu_2023.1_2ubuntu1
Rustls 0.23.17 Benchmark: handshake-ticket - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (handshakes/s, More Is Better): a = 1952598.53, b = 1965288.44. Note: (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
Renaissance 0.16 Test: Gaussian Mixture Model (ms, Fewer Is Better): a = 4435.4 (MIN: 4394.25 / MAX: 4847.19), b = 4406.8 (MIN: 4396.48 / MAX: 4764.38).
Z3 Theorem Prover 4.12.1 SMT File: 2.smt2 (Seconds, Fewer Is Better): a = 78.12, b = 77.62. Note: (CXX) g++ options: -lpthread -std=c++17 -fvisibility=hidden -mfpmath=sse -msse -msse2 -O3 -fPIC
ONNX Runtime 1.19 Model: bertsquad-12 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better): a = 13.85, b = 13.93. Note: (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
Cpuminer-Opt 24.3 Algorithm: Magi (kH/s, More Is Better): a = 952.87, b = 958.57. Note: (CXX) g++ options: -O2 -lcurl -lz -lpthread -lgmp
OSPRay 3.2 Benchmark: gravity_spheres_volume/dim_512/ao/real_time (Items Per Second, More Is Better): a = 10.51, b = 10.45.
uvg266 0.8.0 Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, More Is Better): a = 42.08, b = 41.85.
Llama.cpp b4154 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128 (Tokens Per Second, More Is Better): a = 25.81, b = 25.67. Note: (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
ONNX Runtime 1.19 Model: T5 Encoder - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better): a = 261.65, b = 263.01. Note: (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
LZ4 Compression 1.10 Compression Level: 12 - Decompression Speed (MB/s, More Is Better): a = 4331.0, b = 4310.1. Note: (CC) gcc options: -O3 -pthread
Z3 Theorem Prover 4.12.1 SMT File: 1.smt2 (Seconds, Fewer Is Better): a = 32.66, b = 32.82. Note: (CXX) g++ options: -lpthread -std=c++17 -fvisibility=hidden -mfpmath=sse -msse -msse2 -O3 -fPIC
ACES DGEMM 1.0 Sustained Floating-Point Rate (GFLOP/s, More Is Better): a = 1704.04, b = 1711.89. Note: (CC) gcc options: -ffast-math -mavx2 -O3 -fopenmp -lopenblas
CP2K Molecular Dynamics 2024.3 Input: H20-256 (Seconds, Fewer Is Better): a = 221.37, b = 222.39. Note: (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
oneDNN 3.6 Harness: Convolution Batch Shapes Auto - Engine: CPU (ms, Fewer Is Better): a = 1.09519 (MIN: 1.07), b = 1.10013 (MIN: 1.07). Note: (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
Whisper.cpp 1.6.2 Model: ggml-medium.en - Input: 2016 State of the Union (Seconds, Fewer Is Better): a = 888.01, b = 884.10. Note: (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vnni
Whisper.cpp 1.6.2 Model: ggml-small.en - Input: 2016 State of the Union (Seconds, Fewer Is Better): a = 287.77, b = 289.01. Note: (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vnni
OSPRay 3.2 Benchmark: particle_volume/pathtracer/real_time (Items Per Second, More Is Better): a = 192.69, b = 191.88.
LZ4 Compression 1.10 Compression Level: 12 - Compression Speed (MB/s, More Is Better): a = 12.70, b = 12.65. Note: (CC) gcc options: -O3 -pthread
BYTE Unix Benchmark 5.1.3-git Computational Test: Dhrystone 2 (LPS, More Is Better): a = 2867577563.1, b = 2856294482.0. Note: (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
uvg266 0.8.0 Video Input: Bosphorus 1080p - Video Preset: Slow (Frames Per Second, More Is Better): a = 44.16, b = 43.99.
SVT-AV1 2.3 Encoder Mode: Preset 5 - Input: Bosphorus 1080p (Frames Per Second, More Is Better): a = 84.19, b = 84.52. Note: (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Y-Cruncher 0.8.5 Pi Digits To Calculate: 500M (Seconds, Fewer Is Better): a = 4.485, b = 4.502.
Blender 4.3 Blend File: Pabellon Barcelona - Compute: CPU-Only (Seconds, Fewer Is Better): a = 115.05, b = 114.62.
oneDNN 3.6 Harness: Deconvolution Batch shapes_3d - Engine: CPU (ms, Fewer Is Better): a = 1.72958 (MIN: 1.72), b = 1.73590 (MIN: 1.72). Note: (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
SVT-AV1 2.3 Encoder Mode: Preset 3 - Input: Bosphorus 4K (Frames Per Second, More Is Better): a = 9.588, b = 9.555. Note: (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (FPS, More Is Better): a = 89464.75, b = 89738.80. Note: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
SVT-AV1 2.3 Encoder Mode: Preset 8 - Input: Bosphorus 1080p (Frames Per Second, More Is Better): a = 243.70, b = 244.44. Note: (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Cpuminer-Opt 24.3 Algorithm: Ringcoin (kH/s, More Is Better): a = 5108.30, b = 5092.89. Note: (CXX) g++ options: -O2 -lcurl -lz -lpthread -lgmp
LZ4 Compression 1.10 Compression Level: 9 - Compression Speed (MB/s, More Is Better): a = 36.67, b = 36.78. Note: (CC) gcc options: -O3 -pthread
OpenVINO 2024.5 Model: Person Re-Identification Retail FP16 - Device: CPU (ms, Fewer Is Better): a = 3.38 (MIN: 3.3 / MAX: 10.35), b = 3.37 (MIN: 3.3 / MAX: 10.12). Note: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
Llama.cpp b4154 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128 (Tokens Per Second, More Is Better): a = 27.16, b = 27.24. Note: (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
Rustls 0.23.17 Benchmark: handshake-resume - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (handshakes/s, More Is Better): a = 2683018.02, b = 2675450.81. Note: (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
OpenVINO 2024.5 Model: Person Vehicle Bike Detection FP16 - Device: CPU (FPS, More Is Better): a = 1939.89, b = 1945.30. Note: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO GenAI 2024.5 Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU (tokens/s, More Is Better): a = 36.58, b = 36.48.
OpenVINO 2024.5 Model: Road Segmentation ADAS FP16 - Device: CPU (FPS, More Is Better): a = 676.10, b = 674.27. Note: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
Blender 4.3 Blend File: Fishy Cat - Compute: CPU-Only (Seconds, Fewer Is Better): a = 48.58, b = 48.71.
uvg266 0.8.0 Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better): a = 38.45, b = 38.35.
LZ4 Compression 1.10 Compression Level: 2 - Compression Speed (MB/s, More Is Better): a = 315.74, b = 316.56. Note: (CC) gcc options: -O3 -pthread
OpenSSL Algorithm: SHA256 (byte/s, More Is Better): a = 52088429500, b = 52220542360. Note: OpenSSL 3.0.10 1 Aug 2023 (Library: OpenSSL 3.0.10 1 Aug 2023)
uvg266 0.8.0 Video Input: Bosphorus 1080p - Video Preset: Medium (Frames Per Second, More Is Better): a = 48.47, b = 48.59.
Timed Node.js Compilation 21.7.2 Time To Compile (Seconds, Fewer Is Better): a = 273.33, b = 274.01.
OpenVINO 2024.5 Model: Person Vehicle Bike Detection FP16 - Device: CPU (ms, Fewer Is Better): a = 4.11 (MIN: 4.01 / MAX: 12.01), b = 4.10 (MIN: 4 / MAX: 11.38). Note: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO 2024.5 Model: Vehicle Detection FP16 - Device: CPU (FPS, More Is Better): a = 1547.61, b = 1551.35. Note: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO 2024.5 Model: Handwritten English Recognition FP16 - Device: CPU (FPS, More Is Better): a = 1003.55, b = 1005.94. Note: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO GenAI 2024.5 Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU (tokens/s, More Is Better): a = 51.47, b = 51.35.
OpenVINO 2024.5 Model: Handwritten English Recognition FP16 - Device: CPU (ms, Fewer Is Better): a = 31.85 (MIN: 30.68 / MAX: 39.98), b = 31.78 (MIN: 30.57 / MAX: 40.79). Note: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO 2024.5 Model: Face Detection Retail FP16-INT8 - Device: CPU (ms, Fewer Is Better): a = 4.67 (MIN: 4.61 / MAX: 13.07), b = 4.66 (MIN: 4.61 / MAX: 13.06). Note: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
LZ4 Compression 1.10 Compression Level: 9 - Decompression Speed (MB/s, More Is Better): a = 4217.5, b = 4226.5. Note: (CC) gcc options: -O3 -pthread
OSPRay 3.2 Benchmark: particle_volume/scivis/real_time (Items Per Second, More Is Better): a = 12.95, b = 12.92.
Blender 4.3 Blend File: Junkshop - Compute: CPU-Only (Seconds, Fewer Is Better): a = 52.71, b = 52.82.
OpenSSL Algorithm: AES-128-GCM (byte/s, More Is Better): a = 391651514500, b = 392446885890. Note: OpenSSL 3.0.10 1 Aug 2023 (Library: OpenSSL 3.0.10 1 Aug 2023)
oneDNN 3.6 Harness: IP Shapes 1D - Engine: CPU (ms, Fewer Is Better): a = 0.877683 (MIN: 0.85), b = 0.875916 (MIN: 0.84). Note: (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
OpenVINO 2024.5 Model: Vehicle Detection FP16 - Device: CPU (ms, Fewer Is Better): a = 5.14 (MIN: 5.08 / MAX: 14.02), b = 5.13 (MIN: 5.06 / MAX: 15.63). Note: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
SVT-AV1 2.3 Encoder Mode: Preset 3 - Input: Bosphorus 1080p (Frames Per Second, More Is Better): a = 26.32, b = 26.27. Note: (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenVINO 2024.5 Model: Road Segmentation ADAS FP16-INT8 - Device: CPU (ms, Fewer Is Better): a = 10.78 (MIN: 10.55 / MAX: 18.51), b = 10.80 (MIN: 10.54 / MAX: 23.01). Note: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
uvg266 0.8.0 Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better): a = 17.14, b = 17.11.
Palabos 2.3 Grid Size: 100 (Mega Site Updates Per Second, More Is Better): a = 292.35, b = 291.84. Note: (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
WarpX 24.10 Input: Uniform Plasma (Seconds, Fewer Is Better): a = 19.57, b = 19.61. Note: (CXX) g++ options: -O3 -lm
OSPRay 3.2 Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time (Items Per Second, More Is Better): a = 12.35, b = 12.37.
x265 Video Input: Bosphorus 1080p (Frames Per Second, More Is Better): a = 76.04, b = 76.17. Note: x265 [info]: HEVC encoder version 3.5+1-f0c1022b6
OpenVINO 2024.5 Model: Road Segmentation ADAS FP16 - Device: CPU (ms, Fewer Is Better): a = 11.82 (MIN: 11.63 / MAX: 24.74), b = 11.84 (MIN: 11.63 / MAX: 23.46). Note: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
LZ4 Compression 1.10 Compression Level: 3 - Decompression Speed (MB/s, More Is Better): a = 4031.0, b = 4037.8. Note: (CC) gcc options: -O3 -pthread
OpenVINO 2024.5 Model: Handwritten English Recognition FP16-INT8 - Device: CPU (ms, Fewer Is Better): a = 29.92 (MIN: 29.16 / MAX: 39.35), b = 29.97 (MIN: 29.26 / MAX: 39.7). Note: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
NAMD 3.0 Input: STMV with 1,066,628 Atoms (ns/day, More Is Better): a = 1.57997, b = 1.57748.
OpenVINO 2024.5 Model: Person Detection FP16 - Device: CPU (FPS, More Is Better): a = 209.35, b = 209.68. Note: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO 2024.5 Model: Person Detection FP16 - Device: CPU (ms, Fewer Is Better): a = 38.19 (MIN: 37.67 / MAX: 53.34), b = 38.13 (MIN: 37.66 / MAX: 53.54). Note: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
oneDNN 3.6 Harness: Recurrent Neural Network Inference - Engine: CPU (ms, Fewer Is Better): a = 458.25 (MIN: 453.92), b = 457.56 (MIN: 455.95). Note: (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
Laghos 3.1 Test: Triple Point Problem (Major Kernels Total Rate, More Is Better): a = 219.97, b = 220.28. Note: (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi
OpenVINO 2024.5 Model: Handwritten English Recognition FP16-INT8 - Device: CPU (FPS, More Is Better): a = 1068.15, b = 1066.68. Note: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO 2024.5 Model: Face Detection Retail FP16-INT8 - Device: CPU (FPS, More Is Better): a = 6810.88, b = 6820.14. Note: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
Cpuminer-Opt 24.3 Algorithm: Triple SHA-256, Onecoin (kH/s, More Is Better): a = 121450, b = 121600. Note: (CXX) g++ options: -O2 -lcurl -lz -lpthread -lgmp
OpenSSL 3.3 Algorithm: ChaCha20 (byte/s, More Is Better): a = 198651380530, b = 198407362970. Note: (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenVINO 2024.5 Model: Weld Porosity Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): a = 8.16 (MIN: 8 / MAX: 14.42), b = 8.15 (MIN: 8 / MAX: 14.94). Note: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO 2024.5 Model: Machine Translation EN To DE FP16 - Device: CPU (ms, Fewer Is Better): a = 36.89 (MIN: 35.02 / MAX: 58.88), b = 36.93 (MIN: 34.97 / MAX: 57.6). Note: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
C-Ray 2.0 Resolution: 4K - Rays Per Pixel: 16 (Seconds, Fewer Is Better): a = 74.73, b = 74.65. Note: (CC) gcc options: -lpthread -lm
Blender 4.3 Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better): a = 361.09, b = 361.47.
Rustls 0.23.17 Benchmark: handshake-ticket - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 (handshakes/s, More Is Better): a = 1342172.28, b = 1340803.72. Note: (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
OpenVINO 2024.5 Model: Machine Translation EN To DE FP16 - Device: CPU (FPS, More Is Better): a = 216.75, b = 216.55. Note: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
ASTC Encoder 5.0 Preset: Thorough (MT/s, More Is Better): a = 29.80, b = 29.83. Note: (CXX) g++ options: -O3 -flto -pthread
Stress-NG 0.17.08 Test: Radix String Sort (Bogo Ops/s, More Is Better): a = 739.37, b = 738.70. Note: (CXX) g++ options: -O2 -std=gnu99 -lc -lm
OpenVINO 2024.5 Model: Face Detection Retail FP16 - Device: CPU (FPS, More Is Better): a = 4519.36, b = 4523.45. Note: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO 2024.5 Model: Weld Porosity Detection FP16 - Device: CPU (FPS, More Is Better): a = 2032.87, b = 2031.11. Note: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
QuantLib 1.35-dev Size: XXS (tasks/s, More Is Better): a = 28.03, b = 28.05. Note: (CXX) g++ options: -O3 -march=native -fPIE -pie
OpenVINO 2024.5 Model: Noise Suppression Poconet-Like FP16 - Device: CPU (ms, Fewer Is Better): a = 11.82 (MIN: 10.21 / MAX: 38.36), b = 11.81 (MIN: 10.25 / MAX: 34.11). Note: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
Stress-NG Test: Socket Activity OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.17.08 Test: Socket Activity a b 4K 8K 12K 16K 20K 17142.77 17128.52 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
OpenVINO Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU a b 13K 26K 39K 52K 65K 60277.78 60327.61 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
LZ4 Compression Compression Level: 1 - Compression Speed OpenBenchmarking.org MB/s, More Is Better LZ4 Compression 1.10 Compression Level: 1 - Compression Speed a b 150 300 450 600 750 695.86 695.34 1. (CC) gcc options: -O3 -pthread
OpenVINO Model: Weld Porosity Detection FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16-INT8 - Device: CPU a b 800 1600 2400 3200 4000 3909.50 3912.36 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
Blender 4.3 Blend File: Classroom - Compute: CPU-Only (Seconds, Fewer Is Better): a = 95.81, b = 95.74.
Cpuminer-Opt 24.3 Algorithm: Deepcoin (kH/s, More Is Better): a = 13680, b = 13690. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lgmp
OpenVINO 2024.5 Model: Road Segmentation ADAS FP16-INT8 - Device: CPU (FPS, More Is Better): a = 740.54, b = 740.04. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
BYTE Unix Benchmark 5.1.3-git Computational Test: Whetstone Double (MWIPS, More Is Better): a = 500231.6, b = 499902.4. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
uvg266 0.8.0 Video Input: Bosphorus 4K - Video Preset: Slow (Frames Per Second, More Is Better): a = 15.47, b = 15.46.
OpenVINO 2024.5 Model: Vehicle Detection FP16-INT8 - Device: CPU (FPS, More Is Better): a = 2233.42, b = 2234.86. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
Rustls 0.23.17 Benchmark: handshake - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (handshakes/s, More Is Better): a = 82026.08, b = 81973.92. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
OpenSSL 3.3 Algorithm: AES-256-GCM (byte/s, More Is Better): a = 335590591420, b = 335400128510. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
LZ4 Compression 1.10 Compression Level: 2 - Decompression Speed (MB/s, More Is Better): a = 3856.4, b = 3854.3. (CC) gcc options: -O3 -pthread
ASTC Encoder 5.0 Preset: Fast (MT/s, More Is Better): a = 534.84, b = 534.56. (CXX) g++ options: -O3 -flto -pthread
C-Ray 2.0 Resolution: 1080p - Rays Per Pixel: 16 (Seconds, Fewer Is Better): a = 18.84, b = 18.85. (CC) gcc options: -lpthread -lm
ASTC Encoder 5.0 Preset: Very Thorough (MT/s, More Is Better): a = 4.1034, b = 4.1055. (CXX) g++ options: -O3 -flto -pthread
OpenVINO 2024.5 Model: Face Detection FP16-INT8 - Device: CPU (FPS, More Is Better): a = 39.12, b = 39.10. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO 2024.5 Model: Face Detection FP16 - Device: CPU (FPS, More Is Better): a = 20.83, b = 20.84. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
ASTC Encoder 5.0 Preset: Exhaustive (MT/s, More Is Better): a = 2.5106, b = 2.5098. (CXX) g++ options: -O3 -flto -pthread
OpenVINO 2024.5 Model: Person Re-Identification Retail FP16 - Device: CPU (FPS, More Is Better): a = 2362.71, b = 2363.42. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO 2024.5 Model: Noise Suppression Poconet-Like FP16 - Device: CPU (FPS, More Is Better): a = 2676.03, b = 2676.83. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
srsRAN Project 24.10 Test: PUSCH Processor Benchmark, Throughput Total (Mbps, More Is Better): a = 2260.8, b = 2260.2. (CXX) g++ options: -O3 -march=native -mtune=generic -fno-trapping-math -fno-math-errno -ldl
OpenVINO 2024.5 Model: Person Detection FP32 - Device: CPU (ms, Fewer Is Better): a = 38.20 (MIN: 37.68 / MAX: 52.5), b = 38.21 (MIN: 37.65 / MAX: 54.36). (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
LZ4 Compression 1.10 Compression Level: 1 - Decompression Speed (MB/s, More Is Better): a = 4460.4, b = 4459.3. (CC) gcc options: -O3 -pthread
SVT-AV1 2.3 Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, More Is Better): a = 91.14, b = 91.17. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Primesieve 12.6 Length: 1e12 (Seconds, Fewer Is Better): a = 4.150, b = 4.151. (CXX) g++ options: -O3
Primesieve 12.6 Length: 1e13 (Seconds, Fewer Is Better): a = 50.27, b = 50.28. (CXX) g++ options: -O3
OpenSSL 3.3 Algorithm: RSA4096 (verify/s, More Is Better): a = 558789.7, b = 558659.8. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
Cpuminer-Opt 24.3 Algorithm: scrypt (kH/s, More Is Better): a = 482.67, b = 482.56. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lgmp
OpenSSL 3.3 Algorithm: AES-128-GCM (byte/s, More Is Better): a = 393314332400, b = 393393574980. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
QuantLib 1.35-dev Size: S (tasks/s, More Is Better): a = 27.59, b = 27.59. (CXX) g++ options: -O3 -march=native -fPIE -pie
Cpuminer-Opt 24.3 Algorithm: Quad SHA-256, Pyrite (kH/s, More Is Better): a = 102660, b = 102640. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lgmp
BYTE Unix Benchmark 5.1.3-git Computational Test: Pipe (LPS, More Is Better): a = 58878998.0, b = 58867931.7. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
LZ4 Compression 1.10 Compression Level: 3 - Compression Speed (MB/s, More Is Better): a = 107.13, b = 107.11. (CC) gcc options: -O3 -pthread
LiteRT 2024-10-15 Model: Inception ResNet V2 (Microseconds, Fewer Is Better): a = 26499.5, b = 26503.6.
OpenVINO 2024.5 Model: Person Detection FP32 - Device: CPU (FPS, More Is Better): a = 209.21, b = 209.24. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO 2024.5 Model: Face Detection FP16 - Device: CPU (ms, Fewer Is Better): a = 383.43 (MIN: 381.89 / MAX: 409.27), b = 383.38 (MIN: 381.95 / MAX: 410.36). (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenSSL Algorithm: SHA512 (byte/s, More Is Better): a = 16883923960, b = 16881854110. OpenSSL 3.0.10 1 Aug 2023 (Library: OpenSSL 3.0.10 1 Aug 2023)
BYTE Unix Benchmark 5.1.3-git Computational Test: System Call (LPS, More Is Better): a = 47042441.2, b = 47036981.1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
Stress-NG 0.17.08 Test: Bitonic Integer Sort (Bogo Ops/s, More Is Better): a = 385.60, b = 385.56. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Cpuminer-Opt 24.3 Algorithm: Blake-2 S (kH/s, More Is Better): a = 220790, b = 220810. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lgmp
OpenSSL 3.3 Algorithm: ChaCha20-Poly1305 (byte/s, More Is Better): a = 141481760360, b = 141469554280. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL 3.3 Algorithm: SHA256 (byte/s, More Is Better): a = 52436660260, b = 52440112330. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL Algorithm: ChaCha20-Poly1305 (byte/s, More Is Better): a = 141514205590, b = 141522365640. OpenSSL 3.0.10 1 Aug 2023 (Library: OpenSSL 3.0.10 1 Aug 2023)
Renaissance 0.16 Test: Finagle HTTP Requests (ms, Fewer Is Better): a = 3928.6 (MAX: 4387.37), b = 3928.4 (MAX: 4504.04).
OpenVINO 2024.5 Model: Face Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): a = 204.27 (MIN: 203.52 / MAX: 215.59), b = 204.28 (MIN: 203.5 / MAX: 214.35). (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
ASTC Encoder 5.0 Preset: Medium (MT/s, More Is Better): a = 221.82, b = 221.83. (CXX) g++ options: -O3 -flto -pthread
OpenSSL Algorithm: AES-256-GCM (byte/s, More Is Better): a = 334767993920, b = 334753808110. OpenSSL 3.0.10 1 Aug 2023 (Library: OpenSSL 3.0.10 1 Aug 2023)
OpenSSL Algorithm: ChaCha20 (byte/s, More Is Better): a = 198630347840, b = 198638149630. OpenSSL 3.0.10 1 Aug 2023 (Library: OpenSSL 3.0.10 1 Aug 2023)
C-Ray 2.0 Resolution: 5K - Rays Per Pixel: 16 (Seconds, Fewer Is Better): a = 132.73, b = 132.73. (CC) gcc options: -lpthread -lm
Laghos 3.1 Test: Sedov Blast Wave, ube_922_hex.mesh (Major Kernels Total Rate, More Is Better): a = 284.26, b = 284.27. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi
OpenSSL 3.3 Algorithm: SHA512 (byte/s, More Is Better): a = 16971437080, b = 16971041430. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL 3.3 Algorithm: RSA4096 (sign/s, More Is Better): a = 22944.9, b = 22944.6. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (ms, Fewer Is Better): a = 0.34 (MIN: 0.32 / MAX: 19.93), b = 0.34 (MIN: 0.32 / MAX: 26.97). (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, Fewer Is Better): a = 0.52 (MIN: 0.5 / MAX: 13.55), b = 0.52 (MIN: 0.5 / MAX: 10.34). (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO 2024.5 Model: Weld Porosity Detection FP16 - Device: CPU (ms, Fewer Is Better): a = 15.72 (MIN: 15.18 / MAX: 27.09), b = 15.72 (MIN: 15.2 / MAX: 24.96). (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO 2024.5 Model: Vehicle Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): a = 3.56 (MIN: 3.53 / MAX: 10.84), b = 3.56 (MIN: 3.52 / MAX: 10.52). (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO 2024.5 Model: Face Detection Retail FP16 - Device: CPU (ms, Fewer Is Better): a = 1.76 (MIN: 1.73 / MAX: 7.26), b = 1.76 (MIN: 1.72 / MAX: 7.01). (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
Cpuminer-Opt 24.3 Algorithm: LBC, LBRY Credits (kH/s, More Is Better): a = 24690, b = 24690. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lgmp
Cpuminer-Opt 24.3 Algorithm: Myriad-Groestl (kH/s, More Is Better): a = 17760, b = 17760. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lgmp
Cpuminer-Opt 24.3 Algorithm: Skeincoin (kH/s, More Is Better): a = 67110, b = 67110. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lgmp
Intel Open Image Denoise 2.3 Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only (Images / Sec, More Is Better): a = 0.67, b = 0.67.
Intel Open Image Denoise 2.3 Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only (Images / Sec, More Is Better): a = 1.41, b = 1.41.
Intel Open Image Denoise 2.3 Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only (Images / Sec, More Is Better): a = 1.41, b = 1.41.
WebP Image Encode 1.4 Encode Settings: Quality 100, Lossless, Highest Compression (MP/s, More Is Better): a = 0.57, b = 0.57. (CC) gcc options: -fvisibility=hidden -O2 -lm
WebP Image Encode 1.4 Encode Settings: Quality 100, Highest Compression (MP/s, More Is Better): a = 3.60, b = 3.60. (CC) gcc options: -fvisibility=hidden -O2 -lm
OpenVINO GenAI 2024.5 Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time Per Output Token (ms, Fewer Is Better): a = 27.34, b = 27.41.
OpenVINO GenAI 2024.5 Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time To First Token (ms, Fewer Is Better): a = 58.48, b = 57.04.
OpenVINO GenAI 2024.5 Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time Per Output Token (ms, Fewer Is Better): a = 39.26, b = 37.93.
OpenVINO GenAI 2024.5 Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time To First Token (ms, Fewer Is Better): a = 104.07, b = 98.97.
OpenVINO GenAI 2024.5 Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU - Time Per Output Token (ms, Fewer Is Better): a = 19.43, b = 19.47.
OpenVINO GenAI 2024.5 Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU - Time To First Token (ms, Fewer Is Better): a = 21.71, b = 21.89.
OpenVINO GenAI 2024.5 Model: Gemma-7b-int4-ov - Device: CPU - Time Per Output Token (ms, Fewer Is Better): a = 49.41, b = 49.77.
OpenVINO GenAI 2024.5 Model: Gemma-7b-int4-ov - Device: CPU - Time To First Token (ms, Fewer Is Better): a = 115.87, b = 114.75.
ONNX Runtime 1.19 Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard (Inference Time Cost (ms), Fewer Is Better): a = 29.63, b = 23.03. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard (Inference Time Cost (ms), Fewer Is Better): a = 598.26, b = 514.10. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 Model: super-resolution-10 - Device: CPU - Executor: Standard (Inference Time Cost (ms), Fewer Is Better): a = 10.03, b = 10.13. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard (Inference Time Cost (ms), Fewer Is Better): a = 4.48901, b = 4.52770. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard (Inference Time Cost (ms), Fewer Is Better): a = 33.68, b = 32.19. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 Model: fcn-resnet101-11 - Device: CPU - Executor: Standard (Inference Time Cost (ms), Fewer Is Better): a = 272.83, b = 239.83. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard (Inference Time Cost (ms), Fewer Is Better): a = 1.71077, b = 1.72194. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 Model: bertsquad-12 - Device: CPU - Executor: Standard (Inference Time Cost (ms), Fewer Is Better): a = 72.21, b = 71.77. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 Model: T5 Encoder - Device: CPU - Executor: Standard (Inference Time Cost (ms), Fewer Is Better): a = 3.82017, b = 3.80081. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 Model: ZFNet-512 - Device: CPU - Executor: Standard (Inference Time Cost (ms), Fewer Is Better): a = 8.08821, b = 7.97344. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 Model: yolov4 - Device: CPU - Executor: Standard (Inference Time Cost (ms), Fewer Is Better): a = 143.23, b = 116.30. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 Model: GPT-2 - Device: CPU - Executor: Standard (Inference Time Cost (ms), Fewer Is Better): a = 7.20440, b = 7.64339. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
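The ONNX Runtime inference-time results above show the largest gaps between runs a and b in this part of the export. As a reading aid only, the following minimal Python sketch computes the change of run b relative to run a for a few of those results. The names `results_ms` and `percent_change` are hypothetical and not part of the Phoronix Test Suite; the numbers are copied from the lines above. The export lists one value per run and no error margins, so the sketch makes no claim about statistical significance.

```python
# Inference time cost in ms (fewer is better), copied from the
# ONNX Runtime 1.19 results above as (run a, run b).
# `results_ms` and `percent_change` are illustrative names only.
results_ms = {
    "Faster R-CNN R-50-FPN-int8": (29.63, 23.03),
    "ResNet101_DUC_HDC-12": (598.26, 514.10),
    "fcn-resnet101-11": (272.83, 239.83),
    "yolov4": (143.23, 116.30),
    "GPT-2": (7.20440, 7.64339),
}


def percent_change(a: float, b: float) -> float:
    """Change of run b relative to run a, in percent.

    For a fewer-is-better metric, a negative value means b was faster.
    """
    return (b - a) / a * 100.0


for model, (a, b) in results_ms.items():
    print(f"{model}: {percent_change(a, b):+.1f}%")
```

Because these metrics are "Fewer Is Better", a negative percentage means run b took less time than run a; for "More Is Better" results the sign would read the other way.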
Phoronix Test Suite v10.8.5