genoa tests eoy2024 Benchmarks for a future article. 2 x AMD EPYC 9124 16-Core testing with a AMD Titanite_4G (RTI1007B BIOS) and ASPEED on Ubuntu 23.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2412273-NE-GENOATEST41&grr .
genoa tests eoy2024 Processor Motherboard Chipset Memory Disk Graphics Network OS Kernel Compiler File-System Screen Resolution a b 2 x AMD EPYC 9124 16-Core @ 3.00GHz (32 Cores / 64 Threads) AMD Titanite_4G (RTI1007B BIOS) AMD Device 14a4 1520GB 3201GB Micron_7450_MTFDKCB3T2TFS ASPEED Broadcom NetXtreme BCM5720 PCIe Ubuntu 23.10 6.9.0-060900rc1daily20240327-generic (x86_64) GCC 13.2.0 ext4 1920x1200 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10113e Java Details - OpenJDK Runtime Environment (build 11.0.23+9-post-Ubuntu-1ubuntu123.10.1) Python Details - Python 3.11.6 Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
genoa tests eoy2024 whisper-cpp: ggml-medium.en - 2016 State of the Union quantlib: S blender: Barbershop - CPU-Only rustls: handshake-resume - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 rustls: handshake-ticket - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 whisperfile: Medium openssl: RSA4096 openssl: RSA4096 byte: Whetstone Double whisper-cpp: ggml-small.en - 2016 State of the Union build-nodejs: Time To Compile byte: Dhrystone 2 byte: Pipe byte: System Call svt-av1: Preset 3 - Bosphorus 4K cp2k: H20-256 xnnpack: QS8MobileNetV2 xnnpack: FP16MobileNetV3Small xnnpack: FP16MobileNetV3Large xnnpack: FP16MobileNetV2 xnnpack: FP16MobileNetV1 xnnpack: FP32MobileNetV3Small xnnpack: FP32MobileNetV3Large xnnpack: FP32MobileNetV2 xnnpack: FP32MobileNetV1 stockfish: Chess Benchmark ospray: particle_volume/pathtracer/real_time quantlib: XXS openssl: ChaCha20-Poly1305 openssl: AES-256-GCM openssl: AES-128-GCM openssl: ChaCha20 openssl: SHA512 openssl: SHA256 openssl: ChaCha20-Poly1305 openssl: AES-256-GCM openssl: AES-128-GCM openssl: ChaCha20 openssl: SHA512 openssl: SHA256 whisperfile: Small ospray: particle_volume/scivis/real_time svt-av1: Preset 3 - Bosphorus 1080p whisper-cpp: ggml-base.en - 2016 State of the Union c-ray: 5K - 16 rustls: handshake-resume - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 blender: Pabellon Barcelona - CPU-Only rustls: handshake-ticket - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 renaissance: ALS Movie Lens srsran: PUSCH Processor Benchmark, Throughput Total laghos: Sedov Blast Wave, ube_922_hex.mesh ospray: particle_volume/ao/real_time blender: Classroom - CPU-Only vvenc: Bosphorus 4K - Fast renaissance: Akka Unbalanced Cobwebbed Tree simdjson: DistinctUserID simdjson: TopTweet build2: Time To Compile simdjson: PartialTweets svt-av1: Preset 5 - Bosphorus 4K renaissance: In-Memory Database Shootout renaissance: Savina Reactors.IO renaissance: Apache Spark PageRank z3: 2.smt2 onednn: Recurrent Neural Network Training - CPU c-ray: 4K - 16 onednn: Recurrent Neural Network Inference - CPU renaissance: Gaussian Mixture Model rustls: handshake - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 renaissance: Apache Spark Bayes renaissance: Finagle HTTP Requests palabos: 100 renaissance: Rand Forest simdjson: Kostya openvino-genai: Gemma-7b-int4-ov - CPU - Time Per Output Token openvino-genai: Gemma-7b-int4-ov - CPU - Time To First Token openvino-genai: Gemma-7b-int4-ov - CPU renaissance: Genetic Algorithm Using Jenetics + Futures y-cruncher: 5B astcenc: Very Thorough renaissance: Scala Dotty openvino: Face Detection FP16 - CPU openvino: Face Detection FP16 - CPU openvino: Face Detection FP16-INT8 - CPU openvino: Face Detection FP16-INT8 - CPU onnx: ResNet101_DUC_HDC-12 - CPU - Standard onnx: ResNet101_DUC_HDC-12 - CPU - Standard openvino: Noise Suppression Poconet-Like FP16 - CPU openvino: Noise Suppression Poconet-Like FP16 - CPU llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024 astcenc: Exhaustive openvino: Machine Translation EN To DE FP16 - CPU openvino: Machine Translation EN To DE FP16 - CPU onnx: GPT-2 - CPU - Standard onnx: GPT-2 - CPU - Standard onnx: ZFNet-512 - CPU - Standard onnx: ZFNet-512 - CPU - Standard llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024 onnx: fcn-resnet101-11 - CPU - Standard onnx: fcn-resnet101-11 - CPU - Standard openvino: Person Detection FP16 - CPU openvino: Person Detection FP16 - CPU openvino: Person Detection FP32 - CPU openvino: Person Detection FP32 - CPU onnx: bertsquad-12 - CPU - Standard onnx: bertsquad-12 - CPU - Standard onnx: T5 Encoder - CPU - Standard onnx: T5 Encoder - CPU - Standard onnx: yolov4 - CPU - Standard onnx: yolov4 - CPU - Standard onnx: ArcFace ResNet-100 - CPU - Standard onnx: ArcFace ResNet-100 - CPU - Standard openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino: Person Re-Identification Retail FP16 - CPU openvino: Person Re-Identification Retail FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Face Detection Retail FP16 - CPU openvino: Face Detection Retail FP16 - CPU onnx: CaffeNet 12-int8 - CPU - Standard onnx: CaffeNet 12-int8 - CPU - Standard onnx: ResNet50 v1-12-int8 - CPU - Standard onnx: ResNet50 v1-12-int8 - CPU - Standard onnx: super-resolution-10 - CPU - Standard onnx: super-resolution-10 - CPU - Standard ospray: gravity_spheres_volume/dim_512/scivis/real_time ospray: gravity_spheres_volume/dim_512/ao/real_time ospray: gravity_spheres_volume/dim_512/pathtracer/real_time simdjson: LargeRand compress-lz4: 12 - Decompression Speed compress-lz4: 12 - Compression Speed namd: STMV with 1,066,628 Atoms blender: Junkshop - CPU-Only palabos: 500 gromacs: water_GMX50_bare palabos: 400 whisperfile: Tiny openvino-genai: Falcon-7b-instruct-int4-ov - CPU - Time Per Output Token openvino-genai: Falcon-7b-instruct-int4-ov - CPU - Time To First Token openvino-genai: Falcon-7b-instruct-int4-ov - CPU litert: Inception V4 litert: Inception ResNet V2 litert: NASNet Mobile litert: Mobilenet Float litert: SqueezeNet litert: DeepLab V3 litert: Quantized COCO SSD MobileNet v1 litert: Mobilenet Quant primesieve: 1e13 laghos: Triple Point Problem blender: Fishy Cat - CPU-Only vvenc: Bosphorus 4K - Faster oidn: RTLightmap.hdr.4096x4096 - CPU-Only build-php: Time To Compile compress-lz4: 9 - Decompression Speed compress-lz4: 9 - Compression Speed svt-av1: Preset 5 - Bosphorus 1080p compress-lz4: 1 - Decompression Speed compress-lz4: 1 - Compression Speed compress-lz4: 2 - Decompression Speed compress-lz4: 2 - Compression Speed compress-lz4: 3 - Decompression Speed compress-lz4: 3 - Compression Speed webp: Quality 100, Lossless, Highest Compression build-eigen: Time To Compile uvg266: Bosphorus 4K - Slow blender: BMW27 - CPU-Only uvg266: Bosphorus 4K - Medium z3: 1.smt2 vvenc: Bosphorus 1080p - Fast namd: ATPase with 327,506 Atoms svt-av1: Preset 8 - Bosphorus 4K cpuminer-opt: x20r cpuminer-opt: Garlicoin cpuminer-opt: LBC, LBRY Credits stress-ng: Radix String Sort stress-ng: Socket Activity stress-ng: CPU Stress stress-ng: Context Switching stress-ng: Bitonic Integer Sort cpuminer-opt: Ringcoin cpuminer-opt: Magi cpuminer-opt: Quad SHA-256, Pyrite cpuminer-opt: Triple SHA-256, Onecoin cpuminer-opt: Deepcoin cpuminer-opt: Myriad-Groestl cpuminer-opt: Skeincoin cpuminer-opt: scrypt cpuminer-opt: Blake-2 S mt-dgemm: Sustained Floating-Point Rate warpx: Plasma Acceleration compress-7zip: Decompression Rating compress-7zip: Compression Rating llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128 warpx: Uniform Plasma llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128 onednn: Deconvolution Batch shapes_1d - CPU x265: Bosphorus 4K openvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU - Time Per Output Token openvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU - Time To First Token openvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU oidn: RT.hdr_alb_nrm.3840x2160 - CPU-Only oidn: RT.ldr_alb_nrm.3840x2160 - CPU-Only openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU - Time Per Output Token openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU - Time To First Token openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU rustls: handshake - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 x265: Bosphorus 4K astcenc: Thorough c-ray: 1080p - 16 webp: Quality 100, Lossless srsran: PDSCH Processor Benchmark, Throughput Total svt-av1: Preset 13 - Bosphorus 4K vvenc: Bosphorus 1080p - Faster svt-av1: Preset 8 - Bosphorus 1080p uvg266: Bosphorus 4K - Very Fast uvg266: Bosphorus 4K - Super Fast onednn: IP Shapes 1D - CPU uvg266: Bosphorus 4K - Ultra Fast uvg266: Bosphorus 1080p - Slow uvg266: Bosphorus 1080p - Medium y-cruncher: 1B astcenc: Fast onednn: IP Shapes 3D - CPU llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128 x265: Bosphorus 1080p astcenc: Medium svt-av1: Preset 13 - Bosphorus 1080p x265: Bosphorus 1080p webp: Quality 100, Highest Compression y-cruncher: 500M onednn: Convolution Batch Shapes Auto - CPU uvg266: Bosphorus 1080p - Very Fast uvg266: Bosphorus 1080p - Super Fast uvg266: Bosphorus 1080p - Ultra Fast primesieve: 1e12 onednn: Deconvolution Batch shapes_3d - CPU webp: Quality 100 webp: Default cassandra: Writes a b 888.00606 27.5863 361.09 1663749.32 1342172.28 343.12909 558789.7 22944.9 500231.6 287.77075 273.332 2867577563.1 58878998 47042441.2 9.588 221.369 3611 3878 5309 3458 2095 4008 5319 3628 2145 100001520 192.693 28.0287 141514205590 334767993920 391651514500 198630347840 16883923960 52088429500 141481760360 335590591420 393314332400 198651380530 16971437080 52436660260 154.79938 12.947 26.323 138.26158 132.726 2683018.02 115.05 1952598.53 25095.1 2260.8 284.26 12.8579 95.81 6.979 18669.0 6.66 7.21 80.398 6.55 31.929 6745.9 10442.4 3766.0 78.124 806.396 74.731 458.249 4435.4 401303.16 260.0 3928.6 292.345 788.0 4.18 49.41 115.87 20.24 2237.2 52.861 4.1034 801.9 383.43 20.83 204.27 39.12 598.257 1.67151 11.82 2676.03 99.46 2.5106 36.89 216.75 7.2044 138.699 8.08821 123.608 112.18 272.833 3.6652 38.19 209.35 38.2 209.21 72.2085 13.848 3.82017 261.652 143.23 6.98157 33.683 29.6861 4.11 1939.89 10.78 740.54 4.67 6810.88 0.34 89464.75 3.38 2362.71 11.82 676.1 29.6269 33.7484 29.92 1068.15 31.85 1003.55 0.52 60277.78 15.72 2032.87 3.56 2233.42 8.16 3909.5 5.14 1547.61 1.76 4519.36 1.71077 584.297 4.48901 222.671 10.0326 99.6682 10.2963 10.5125 12.3457 1.25 4331 12.7 1.57997 52.71 412.575 3.837 397.005 50.45276 39.26 104.07 25.47 23103.8 26499.5 42858.2 2057.94 3102.45 4985.32 3245.22 1747.96 50.267 219.97 48.58 13.228 0.67 44.663 4217.5 36.67 84.194 4460.4 695.86 3856.4 315.74 4031 107.13 0.57 42.209 15.47 36.74 17.14 32.661 19.411 5.17004 91.143 12480 7384.28 24690 739.37 17142.77 83077.92 7693259.49 385.6 5108.3 952.87 102660 121450 13680 17760 67110 482.67 220790 1704.035273 28.40062329 239848 256434 267.01 25.81 19.57102025 27.16 5.80994 26.13 19.43 21.71 51.47 1.41 1.41 27.34 58.48 36.58 82026.08 30.76 29.8005 18.839 1.44 40641.5 184.519 39.997 243.702 38.45 39.26 0.877683 42.08 44.16 48.47 9.23 534.8428 0.724443 74.22 76.04 221.8177 529.371 87.75 3.60 4.485 1.09519 121.3 131.92 143.36 4.15 1.72958 11.30 18.60 884.09875 27.5917 361.47 1681659.16 1340803.72 345.49469 558659.8 22944.6 499902.4 289.01438 274.008 2856294482 58867931.7 47036981.1 9.555 222.385 3547 3694 4990 3358 2040 3963 5500 3733 2101 104118302 191.877 28.0526 141522365640 334753808110 392446885890 198638149630 16881854110 52220542360 141469554280 335400128510 393393574980 198407362970 16971041430 52440112330 150.91319 12.92 26.273 134.45784 132.731 2675450.81 114.62 1965288.44 25588.8 2260.2 284.27 12.9818 95.74 6.928 19497.4 6.56 7.27 81.362 6.62 32.159 7016.4 11336.1 3793.3 77.621 821.069 74.651 457.556 4406.8 387481.69 250.5 3928.4 291.837 713.7 4.26 49.77 114.75 20.09 2264.2 52.084 4.1055 751.0 383.38 20.84 204.28 39.1 514.102 1.94512 11.81 2676.83 110.77 2.5098 36.93 216.55 7.64339 130.761 7.97344 125.379 96.97 239.832 4.16952 38.13 209.68 38.21 209.24 71.771 13.9324 3.80081 263.007 116.302 8.59803 32.1906 31.0624 4.1 1945.3 10.8 740.04 4.66 6820.14 0.34 89738.8 3.37 2363.42 11.84 674.27 23.0254 43.4229 29.97 1066.68 31.78 1005.94 0.52 60327.61 15.72 2031.11 3.56 2234.86 8.15 3912.36 5.13 1551.35 1.76 4523.45 1.72194 580.471 4.5277 220.75 10.1289 98.7203 10.3832 10.4511 12.367 1.26 4310.1 12.65 1.57748 52.82 409.041 3.862 400.261 51.75138 37.93 98.97 26.36 22746.1 26503.6 48856.7 2020.43 3060.9 5073.41 3213.83 1708.44 50.279 220.28 48.71 13.627 0.67 44.321 4226.5 36.78 84.519 4459.3 695.34 3854.3 316.56 4037.8 107.11 0.57 40.787 15.46 36.36 17.11 32.815 19.943 5.13084 91.165 12750 6969.32 24690 738.7 17128.52 83768.42 7764860.59 385.56 5092.89 958.57 102640 121600 13690 17760 67110 482.56 220810 1711.894448 28.81928408 248785 251980 189.81 25.67 19.6050734 27.24 5.70868 25.79 19.47 21.89 51.35 1.41 1.41 27.41 57.04 36.48 81973.92 29.69 29.8279 18.849 1.45 41979.9 166.11 39.205 244.444 38.35 39.55 0.875916 41.85 43.99 48.59 9.166 534.5559 0.718503 73.02 76.17 221.8274 505.636 85.5 3.60 4.502 1.10013 123.34 133.52 141.19 4.151 1.7359 11.42 17.33 OpenBenchmarking.org
Whisper.cpp Model: ggml-medium.en - Input: 2016 State of the Union OpenBenchmarking.org Seconds, Fewer Is Better Whisper.cpp 1.6.2 Model: ggml-medium.en - Input: 2016 State of the Union a b 200 400 600 800 1000 888.01 884.10 1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vnni
QuantLib Size: S OpenBenchmarking.org tasks/s, More Is Better QuantLib 1.35-dev Size: S a b 6 12 18 24 30 27.59 27.59 1. (CXX) g++ options: -O3 -march=native -fPIE -pie
Blender Blend File: Barbershop - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.3 Blend File: Barbershop - Compute: CPU-Only a b 80 160 240 320 400 361.09 361.47
Rustls Benchmark: handshake-resume - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 OpenBenchmarking.org handshakes/s, More Is Better Rustls 0.23.17 Benchmark: handshake-resume - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 a b 400K 800K 1200K 1600K 2000K 1663749.32 1681659.16 1. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
Rustls Benchmark: handshake-ticket - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 OpenBenchmarking.org handshakes/s, More Is Better Rustls 0.23.17 Benchmark: handshake-ticket - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 a b 300K 600K 900K 1200K 1500K 1342172.28 1340803.72 1. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
Whisperfile Model Size: Medium OpenBenchmarking.org Seconds, Fewer Is Better Whisperfile 20Aug24 Model Size: Medium a b 80 160 240 320 400 343.13 345.49
OpenSSL Algorithm: RSA4096 OpenBenchmarking.org verify/s, More Is Better OpenSSL 3.3 Algorithm: RSA4096 a b 120K 240K 360K 480K 600K 558789.7 558659.8 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL Algorithm: RSA4096 OpenBenchmarking.org sign/s, More Is Better OpenSSL 3.3 Algorithm: RSA4096 a b 5K 10K 15K 20K 25K 22944.9 22944.6 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
BYTE Unix Benchmark Computational Test: Whetstone Double OpenBenchmarking.org MWIPS, More Is Better BYTE Unix Benchmark 5.1.3-git Computational Test: Whetstone Double a b 110K 220K 330K 440K 550K 500231.6 499902.4 1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
Whisper.cpp Model: ggml-small.en - Input: 2016 State of the Union OpenBenchmarking.org Seconds, Fewer Is Better Whisper.cpp 1.6.2 Model: ggml-small.en - Input: 2016 State of the Union a b 60 120 180 240 300 287.77 289.01 1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vnni
Timed Node.js Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed Node.js Compilation 21.7.2 Time To Compile a b 60 120 180 240 300 273.33 274.01
BYTE Unix Benchmark Computational Test: Dhrystone 2 OpenBenchmarking.org LPS, More Is Better BYTE Unix Benchmark 5.1.3-git Computational Test: Dhrystone 2 a b 600M 1200M 1800M 2400M 3000M 2867577563.1 2856294482.0 1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
BYTE Unix Benchmark Computational Test: Pipe OpenBenchmarking.org LPS, More Is Better BYTE Unix Benchmark 5.1.3-git Computational Test: Pipe a b 13M 26M 39M 52M 65M 58878998.0 58867931.7 1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
BYTE Unix Benchmark Computational Test: System Call OpenBenchmarking.org LPS, More Is Better BYTE Unix Benchmark 5.1.3-git Computational Test: System Call a b 10M 20M 30M 40M 50M 47042441.2 47036981.1 1. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm
SVT-AV1 Encoder Mode: Preset 3 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 3 - Input: Bosphorus 4K a b 3 6 9 12 15 9.588 9.555 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
CP2K Molecular Dynamics Input: H20-256 OpenBenchmarking.org Seconds, Fewer Is Better CP2K Molecular Dynamics 2024.3 Input: H20-256 a b 50 100 150 200 250 221.37 222.39 1. (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
XNNPACK Model: QS8MobileNetV2 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: QS8MobileNetV2 a b 800 1600 2400 3200 4000 3611 3547 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP16MobileNetV3Small OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP16MobileNetV3Small a b 800 1600 2400 3200 4000 3878 3694 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP16MobileNetV3Large OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP16MobileNetV3Large a b 1100 2200 3300 4400 5500 5309 4990 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP16MobileNetV2 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP16MobileNetV2 a b 700 1400 2100 2800 3500 3458 3358 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP16MobileNetV1 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP16MobileNetV1 a b 400 800 1200 1600 2000 2095 2040 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP32MobileNetV3Small OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP32MobileNetV3Small a b 900 1800 2700 3600 4500 4008 3963 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP32MobileNetV3Large OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP32MobileNetV3Large a b 1200 2400 3600 4800 6000 5319 5500 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP32MobileNetV2 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP32MobileNetV2 a b 800 1600 2400 3200 4000 3628 3733 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP32MobileNetV1 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP32MobileNetV1 a b 500 1000 1500 2000 2500 2145 2101 1. (CXX) g++ options: -O3 -lrt -lm
Stockfish Chess Benchmark OpenBenchmarking.org Nodes Per Second, More Is Better Stockfish 17 Chess Benchmark a b 20M 40M 60M 80M 100M 100001520 104118302 1. (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -funroll-loops -msse -msse3 -mpopcnt -mavx2 -mbmi -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto-partition=one -flto=jobserver
OSPRay Benchmark: particle_volume/pathtracer/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 3.2 Benchmark: particle_volume/pathtracer/real_time a b 40 80 120 160 200 192.69 191.88
QuantLib Size: XXS OpenBenchmarking.org tasks/s, More Is Better QuantLib 1.35-dev Size: XXS a b 7 14 21 28 35 28.03 28.05 1. (CXX) g++ options: -O3 -march=native -fPIE -pie
OpenSSL Algorithm: ChaCha20-Poly1305 OpenBenchmarking.org byte/s, More Is Better OpenSSL Algorithm: ChaCha20-Poly1305 a b 30000M 60000M 90000M 120000M 150000M 141514205590 141522365640 1. OpenSSL 3.0.10 1 Aug 2023 (Library: OpenSSL 3.0.10 1 Aug 2023)
OpenSSL Algorithm: AES-256-GCM OpenBenchmarking.org byte/s, More Is Better OpenSSL Algorithm: AES-256-GCM a b 70000M 140000M 210000M 280000M 350000M 334767993920 334753808110 1. OpenSSL 3.0.10 1 Aug 2023 (Library: OpenSSL 3.0.10 1 Aug 2023)
OpenSSL Algorithm: AES-128-GCM OpenBenchmarking.org byte/s, More Is Better OpenSSL Algorithm: AES-128-GCM a b 80000M 160000M 240000M 320000M 400000M 391651514500 392446885890 1. OpenSSL 3.0.10 1 Aug 2023 (Library: OpenSSL 3.0.10 1 Aug 2023)
OpenSSL Algorithm: ChaCha20 OpenBenchmarking.org byte/s, More Is Better OpenSSL Algorithm: ChaCha20 a b 40000M 80000M 120000M 160000M 200000M 198630347840 198638149630 1. OpenSSL 3.0.10 1 Aug 2023 (Library: OpenSSL 3.0.10 1 Aug 2023)
OpenSSL Algorithm: SHA512 OpenBenchmarking.org byte/s, More Is Better OpenSSL Algorithm: SHA512 a b 4000M 8000M 12000M 16000M 20000M 16883923960 16881854110 1. OpenSSL 3.0.10 1 Aug 2023 (Library: OpenSSL 3.0.10 1 Aug 2023)
OpenSSL Algorithm: SHA256 OpenBenchmarking.org byte/s, More Is Better OpenSSL Algorithm: SHA256 a b 11000M 22000M 33000M 44000M 55000M 52088429500 52220542360 1. OpenSSL 3.0.10 1 Aug 2023 (Library: OpenSSL 3.0.10 1 Aug 2023)
OpenSSL Algorithm: ChaCha20-Poly1305 OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: ChaCha20-Poly1305 a b 30000M 60000M 90000M 120000M 150000M 141481760360 141469554280 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL Algorithm: AES-256-GCM OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: AES-256-GCM a b 70000M 140000M 210000M 280000M 350000M 335590591420 335400128510 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL Algorithm: AES-128-GCM OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: AES-128-GCM a b 80000M 160000M 240000M 320000M 400000M 393314332400 393393574980 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL Algorithm: ChaCha20 OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: ChaCha20 a b 40000M 80000M 120000M 160000M 200000M 198651380530 198407362970 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL Algorithm: SHA512 OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: SHA512 a b 4000M 8000M 12000M 16000M 20000M 16971437080 16971041430 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL Algorithm: SHA256 OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: SHA256 a b 11000M 22000M 33000M 44000M 55000M 52436660260 52440112330 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
Whisperfile Model Size: Small OpenBenchmarking.org Seconds, Fewer Is Better Whisperfile 20Aug24 Model Size: Small a b 30 60 90 120 150 154.80 150.91
OSPRay Benchmark: particle_volume/scivis/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 3.2 Benchmark: particle_volume/scivis/real_time a b 3 6 9 12 15 12.95 12.92
SVT-AV1 Encoder Mode: Preset 3 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 3 - Input: Bosphorus 1080p a b 6 12 18 24 30 26.32 26.27 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Whisper.cpp Model: ggml-base.en - Input: 2016 State of the Union OpenBenchmarking.org Seconds, Fewer Is Better Whisper.cpp 1.6.2 Model: ggml-base.en - Input: 2016 State of the Union a b 30 60 90 120 150 138.26 134.46 1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vnni
C-Ray Resolution: 5K - Rays Per Pixel: 16 OpenBenchmarking.org Seconds, Fewer Is Better C-Ray 2.0 Resolution: 5K - Rays Per Pixel: 16 a b 30 60 90 120 150 132.73 132.73 1. (CC) gcc options: -lpthread -lm
Rustls Benchmark: handshake-resume - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 OpenBenchmarking.org handshakes/s, More Is Better Rustls 0.23.17 Benchmark: handshake-resume - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 a b 600K 1200K 1800K 2400K 3000K 2683018.02 2675450.81 1. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
Blender Blend File: Pabellon Barcelona - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.3 Blend File: Pabellon Barcelona - Compute: CPU-Only a b 30 60 90 120 150 115.05 114.62
Rustls Benchmark: handshake-ticket - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 OpenBenchmarking.org handshakes/s, More Is Better Rustls 0.23.17 Benchmark: handshake-ticket - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 a b 400K 800K 1200K 1600K 2000K 1952598.53 1965288.44 1. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
Renaissance Test: ALS Movie Lens OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: ALS Movie Lens a b 5K 10K 15K 20K 25K 25095.1 25588.8 MIN: 24327.2 MIN: 24309
srsRAN Project Test: PUSCH Processor Benchmark, Throughput Total OpenBenchmarking.org Mbps, More Is Better srsRAN Project 24.10 Test: PUSCH Processor Benchmark, Throughput Total a b 500 1000 1500 2000 2500 2260.8 2260.2 1. (CXX) g++ options: -O3 -march=native -mtune=generic -fno-trapping-math -fno-math-errno -ldl
Laghos Test: Sedov Blast Wave, ube_922_hex.mesh OpenBenchmarking.org Major Kernels Total Rate, More Is Better Laghos 3.1 Test: Sedov Blast Wave, ube_922_hex.mesh a b 60 120 180 240 300 284.26 284.27 1. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi
OSPRay Benchmark: particle_volume/ao/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 3.2 Benchmark: particle_volume/ao/real_time a b 3 6 9 12 15 12.86 12.98
Blender Blend File: Classroom - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.3 Blend File: Classroom - Compute: CPU-Only a b 20 40 60 80 100 95.81 95.74
VVenC Video Input: Bosphorus 4K - Video Preset: Fast OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.13 Video Input: Bosphorus 4K - Video Preset: Fast a b 2 4 6 8 10 6.979 6.928 1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
Renaissance Test: Akka Unbalanced Cobwebbed Tree OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Akka Unbalanced Cobwebbed Tree a b 4K 8K 12K 16K 20K 18669.0 19497.4 MIN: 17763.87 / MAX: 19829.6 MIN: 19145.14 / MAX: 19580.86
simdjson Throughput Test: DistinctUserID OpenBenchmarking.org GB/s, More Is Better simdjson 3.10 Throughput Test: DistinctUserID a b 2 4 6 8 10 6.66 6.56 1. (CXX) g++ options: -O3 -lrt
simdjson Throughput Test: TopTweet OpenBenchmarking.org GB/s, More Is Better simdjson 3.10 Throughput Test: TopTweet a b 2 4 6 8 10 7.21 7.27 1. (CXX) g++ options: -O3 -lrt
Build2 Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Build2 0.17 Time To Compile a b 20 40 60 80 100 80.40 81.36
simdjson Throughput Test: PartialTweets OpenBenchmarking.org GB/s, More Is Better simdjson 3.10 Throughput Test: PartialTweets a b 2 4 6 8 10 6.55 6.62 1. (CXX) g++ options: -O3 -lrt
SVT-AV1 Encoder Mode: Preset 5 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 5 - Input: Bosphorus 4K a b 7 14 21 28 35 31.93 32.16 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Renaissance Test: In-Memory Database Shootout OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: In-Memory Database Shootout a b 1500 3000 4500 6000 7500 6745.9 7016.4 MIN: 5952.22 / MAX: 7932.15 MIN: 6357.26 / MAX: 8122.82
Renaissance Test: Savina Reactors.IO OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Savina Reactors.IO a b 2K 4K 6K 8K 10K 10442.4 11336.1 MIN: 9259.86 / MAX: 11965.68 MIN: 9682.35 / MAX: 13514.5
Renaissance Test: Apache Spark PageRank OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Apache Spark PageRank a b 800 1600 2400 3200 4000 3766.0 3793.3 MIN: 3285.53 / MAX: 3766.03 MIN: 3259.12 / MAX: 3793.31
Z3 Theorem Prover SMT File: 2.smt2 OpenBenchmarking.org Seconds, Fewer Is Better Z3 Theorem Prover 4.12.1 SMT File: 2.smt2 a b 20 40 60 80 100 78.12 77.62 1. (CXX) g++ options: -lpthread -std=c++17 -fvisibility=hidden -mfpmath=sse -msse -msse2 -O3 -fPIC
oneDNN Harness: Recurrent Neural Network Training - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: Recurrent Neural Network Training - Engine: CPU a b 200 400 600 800 1000 806.40 821.07 MIN: 803.94 MIN: 801.71 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
C-Ray Resolution: 4K - Rays Per Pixel: 16 OpenBenchmarking.org Seconds, Fewer Is Better C-Ray 2.0 Resolution: 4K - Rays Per Pixel: 16 a b 20 40 60 80 100 74.73 74.65 1. (CC) gcc options: -lpthread -lm
oneDNN Harness: Recurrent Neural Network Inference - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: Recurrent Neural Network Inference - Engine: CPU a b 100 200 300 400 500 458.25 457.56 MIN: 453.92 MIN: 455.95 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
Renaissance Test: Gaussian Mixture Model OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Gaussian Mixture Model a b 1000 2000 3000 4000 5000 4435.4 4406.8 MIN: 4394.25 / MAX: 4847.19 MIN: 4396.48 / MAX: 4764.38
Rustls Benchmark: handshake - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 OpenBenchmarking.org handshakes/s, More Is Better Rustls 0.23.17 Benchmark: handshake - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 a b 90K 180K 270K 360K 450K 401303.16 387481.69 1. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
Renaissance Test: Apache Spark Bayes OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Apache Spark Bayes a b 60 120 180 240 300 260.0 250.5 MIN: 219.27 / MAX: 423.67 MIN: 217.63 / MAX: 448.45
Renaissance Test: Finagle HTTP Requests OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Finagle HTTP Requests a b 800 1600 2400 3200 4000 3928.6 3928.4 MAX: 4387.37 MAX: 4504.04
Palabos Grid Size: 100 OpenBenchmarking.org Mega Site Updates Per Second, More Is Better Palabos 2.3 Grid Size: 100 a b 60 120 180 240 300 292.35 291.84 1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
Renaissance Test: Random Forest OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Random Forest a b 200 400 600 800 1000 788.0 713.7 MIN: 591.22 / MAX: 890.34 MIN: 579.9 / MAX: 908.5
simdjson Throughput Test: Kostya OpenBenchmarking.org GB/s, More Is Better simdjson 3.10 Throughput Test: Kostya a b 0.9585 1.917 2.8755 3.834 4.7925 4.18 4.26 1. (CXX) g++ options: -O3 -lrt
OpenVINO GenAI Model: Gemma-7b-int4-ov - Device: CPU - Time Per Output Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Gemma-7b-int4-ov - Device: CPU - Time Per Output Token a b 11 22 33 44 55 49.41 49.77
OpenVINO GenAI Model: Gemma-7b-int4-ov - Device: CPU - Time To First Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Gemma-7b-int4-ov - Device: CPU - Time To First Token a b 30 60 90 120 150 115.87 114.75
OpenVINO GenAI Model: Gemma-7b-int4-ov - Device: CPU OpenBenchmarking.org tokens/s, More Is Better OpenVINO GenAI 2024.5 Model: Gemma-7b-int4-ov - Device: CPU a b 5 10 15 20 25 20.24 20.09
Renaissance Test: Genetic Algorithm Using Jenetics + Futures OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Genetic Algorithm Using Jenetics + Futures a b 500 1000 1500 2000 2500 2237.2 2264.2 MIN: 1595.98 MIN: 1665.04
Y-Cruncher Pi Digits To Calculate: 5B OpenBenchmarking.org Seconds, Fewer Is Better Y-Cruncher 0.8.5 Pi Digits To Calculate: 5B a b 12 24 36 48 60 52.86 52.08
ASTC Encoder Preset: Very Thorough OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 5.0 Preset: Very Thorough a b 0.9237 1.8474 2.7711 3.6948 4.6185 4.1034 4.1055 1. (CXX) g++ options: -O3 -flto -pthread
Renaissance Test: Scala Dotty OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Scala Dotty a b 200 400 600 800 1000 801.9 751.0 MIN: 655.46 / MAX: 1303.91 MIN: 640.27 / MAX: 1261.2
OpenVINO Model: Face Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Face Detection FP16 - Device: CPU a b 80 160 240 320 400 383.43 383.38 MIN: 381.89 / MAX: 409.27 MIN: 381.95 / MAX: 410.36 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Face Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Face Detection FP16 - Device: CPU a b 5 10 15 20 25 20.83 20.84 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Face Detection FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Face Detection FP16-INT8 - Device: CPU a b 40 80 120 160 200 204.27 204.28 MIN: 203.52 / MAX: 215.59 MIN: 203.5 / MAX: 214.35 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Face Detection FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Face Detection FP16-INT8 - Device: CPU a b 9 18 27 36 45 39.12 39.10 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
ONNX Runtime Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard a b 130 260 390 520 650 598.26 514.10 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard a b 0.4377 0.8754 1.3131 1.7508 2.1885 1.67151 1.94512 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
OpenVINO Model: Noise Suppression Poconet-Like FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Noise Suppression Poconet-Like FP16 - Device: CPU a b 3 6 9 12 15 11.82 11.81 MIN: 10.21 / MAX: 38.36 MIN: 10.25 / MAX: 34.11 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Noise Suppression Poconet-Like FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Noise Suppression Poconet-Like FP16 - Device: CPU a b 600 1200 1800 2400 3000 2676.03 2676.83 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
Llama.cpp Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024 a b 20 40 60 80 100 99.46 110.77 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
ASTC Encoder Preset: Exhaustive OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 5.0 Preset: Exhaustive a b 0.5649 1.1298 1.6947 2.2596 2.8245 2.5106 2.5098 1. (CXX) g++ options: -O3 -flto -pthread
OpenVINO Model: Machine Translation EN To DE FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Machine Translation EN To DE FP16 - Device: CPU a b 8 16 24 32 40 36.89 36.93 MIN: 35.02 / MAX: 58.88 MIN: 34.97 / MAX: 57.6 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Machine Translation EN To DE FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Machine Translation EN To DE FP16 - Device: CPU a b 50 100 150 200 250 216.75 216.55 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
ONNX Runtime Model: GPT-2 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: GPT-2 - Device: CPU - Executor: Standard a b 2 4 6 8 10 7.20440 7.64339 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: GPT-2 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: GPT-2 - Device: CPU - Executor: Standard a b 30 60 90 120 150 138.70 130.76 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ZFNet-512 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: ZFNet-512 - Device: CPU - Executor: Standard a b 2 4 6 8 10 8.08821 7.97344 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ZFNet-512 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: ZFNet-512 - Device: CPU - Executor: Standard a b 30 60 90 120 150 123.61 125.38 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
Llama.cpp Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024 a b 30 60 90 120 150 112.18 96.97 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
ONNX Runtime Model: fcn-resnet101-11 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: fcn-resnet101-11 - Device: CPU - Executor: Standard a b 60 120 180 240 300 272.83 239.83 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: fcn-resnet101-11 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: fcn-resnet101-11 - Device: CPU - Executor: Standard a b 0.9381 1.8762 2.8143 3.7524 4.6905 3.66520 4.16952 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
OpenVINO Model: Person Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Person Detection FP16 - Device: CPU a b 9 18 27 36 45 38.19 38.13 MIN: 37.67 / MAX: 53.34 MIN: 37.66 / MAX: 53.54 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Person Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Person Detection FP16 - Device: CPU a b 50 100 150 200 250 209.35 209.68 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Person Detection FP32 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Person Detection FP32 - Device: CPU a b 9 18 27 36 45 38.20 38.21 MIN: 37.68 / MAX: 52.5 MIN: 37.65 / MAX: 54.36 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Person Detection FP32 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Person Detection FP32 - Device: CPU a b 50 100 150 200 250 209.21 209.24 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
ONNX Runtime Model: bertsquad-12 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: bertsquad-12 - Device: CPU - Executor: Standard a b 16 32 48 64 80 72.21 71.77 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: bertsquad-12 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: bertsquad-12 - Device: CPU - Executor: Standard a b 4 8 12 16 20 13.85 13.93 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: T5 Encoder - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: T5 Encoder - Device: CPU - Executor: Standard a b 0.8595 1.719 2.5785 3.438 4.2975 3.82017 3.80081 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: T5 Encoder - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: T5 Encoder - Device: CPU - Executor: Standard a b 60 120 180 240 300 261.65 263.01 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: yolov4 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: yolov4 - Device: CPU - Executor: Standard a b 30 60 90 120 150 143.23 116.30 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: yolov4 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: yolov4 - Device: CPU - Executor: Standard a b 2 4 6 8 10 6.98157 8.59803 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard a b 8 16 24 32 40 33.68 32.19 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard a b 7 14 21 28 35 29.69 31.06 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
OpenVINO Model: Person Vehicle Bike Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Person Vehicle Bike Detection FP16 - Device: CPU a b 0.9248 1.8496 2.7744 3.6992 4.624 4.11 4.10 MIN: 4.01 / MAX: 12.01 MIN: 4 / MAX: 11.38 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Person Vehicle Bike Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Person Vehicle Bike Detection FP16 - Device: CPU a b 400 800 1200 1600 2000 1939.89 1945.30 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Road Segmentation ADAS FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Road Segmentation ADAS FP16-INT8 - Device: CPU a b 3 6 9 12 15 10.78 10.80 MIN: 10.55 / MAX: 18.51 MIN: 10.54 / MAX: 23.01 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Road Segmentation ADAS FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Road Segmentation ADAS FP16-INT8 - Device: CPU a b 160 320 480 640 800 740.54 740.04 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Face Detection Retail FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Face Detection Retail FP16-INT8 - Device: CPU a b 1.0508 2.1016 3.1524 4.2032 5.254 4.67 4.66 MIN: 4.61 / MAX: 13.07 MIN: 4.61 / MAX: 13.06 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Face Detection Retail FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Face Detection Retail FP16-INT8 - Device: CPU a b 1500 3000 4500 6000 7500 6810.88 6820.14 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU a b 0.0765 0.153 0.2295 0.306 0.3825 0.34 0.34 MIN: 0.32 / MAX: 19.93 MIN: 0.32 / MAX: 26.97 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU a b 20K 40K 60K 80K 100K 89464.75 89738.80 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Person Re-Identification Retail FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Person Re-Identification Retail FP16 - Device: CPU a b 0.7605 1.521 2.2815 3.042 3.8025 3.38 3.37 MIN: 3.3 / MAX: 10.35 MIN: 3.3 / MAX: 10.12 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Person Re-Identification Retail FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Person Re-Identification Retail FP16 - Device: CPU a b 500 1000 1500 2000 2500 2362.71 2363.42 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Road Segmentation ADAS FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Road Segmentation ADAS FP16 - Device: CPU a b 3 6 9 12 15 11.82 11.84 MIN: 11.63 / MAX: 24.74 MIN: 11.63 / MAX: 23.46 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Road Segmentation ADAS FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Road Segmentation ADAS FP16 - Device: CPU a b 150 300 450 600 750 676.10 674.27 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
ONNX Runtime Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard a b 7 14 21 28 35 29.63 23.03 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard a b 10 20 30 40 50 33.75 43.42 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
OpenVINO Model: Handwritten English Recognition FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Handwritten English Recognition FP16-INT8 - Device: CPU a b 7 14 21 28 35 29.92 29.97 MIN: 29.16 / MAX: 39.35 MIN: 29.26 / MAX: 39.7 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Handwritten English Recognition FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Handwritten English Recognition FP16-INT8 - Device: CPU a b 200 400 600 800 1000 1068.15 1066.68 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Handwritten English Recognition FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Handwritten English Recognition FP16 - Device: CPU a b 7 14 21 28 35 31.85 31.78 MIN: 30.68 / MAX: 39.98 MIN: 30.57 / MAX: 40.79 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Handwritten English Recognition FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Handwritten English Recognition FP16 - Device: CPU a b 200 400 600 800 1000 1003.55 1005.94 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU a b 0.117 0.234 0.351 0.468 0.585 0.52 0.52 MIN: 0.5 / MAX: 13.55 MIN: 0.5 / MAX: 10.34 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU a b 13K 26K 39K 52K 65K 60277.78 60327.61 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Weld Porosity Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16 - Device: CPU a b 4 8 12 16 20 15.72 15.72 MIN: 15.18 / MAX: 27.09 MIN: 15.2 / MAX: 24.96 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Weld Porosity Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16 - Device: CPU a b 400 800 1200 1600 2000 2032.87 2031.11 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Vehicle Detection FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Vehicle Detection FP16-INT8 - Device: CPU a b 0.801 1.602 2.403 3.204 4.005 3.56 3.56 MIN: 3.53 / MAX: 10.84 MIN: 3.52 / MAX: 10.52 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Vehicle Detection FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Vehicle Detection FP16-INT8 - Device: CPU a b 500 1000 1500 2000 2500 2233.42 2234.86 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Weld Porosity Detection FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16-INT8 - Device: CPU a b 2 4 6 8 10 8.16 8.15 MIN: 8 / MAX: 14.42 MIN: 8 / MAX: 14.94 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Weld Porosity Detection FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16-INT8 - Device: CPU a b 800 1600 2400 3200 4000 3909.50 3912.36 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Vehicle Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Vehicle Detection FP16 - Device: CPU a b 1.1565 2.313 3.4695 4.626 5.7825 5.14 5.13 MIN: 5.08 / MAX: 14.02 MIN: 5.06 / MAX: 15.63 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Vehicle Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Vehicle Detection FP16 - Device: CPU a b 300 600 900 1200 1500 1547.61 1551.35 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Face Detection Retail FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Face Detection Retail FP16 - Device: CPU a b 0.396 0.792 1.188 1.584 1.98 1.76 1.76 MIN: 1.73 / MAX: 7.26 MIN: 1.72 / MAX: 7.01 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Face Detection Retail FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Face Detection Retail FP16 - Device: CPU a b 1000 2000 3000 4000 5000 4519.36 4523.45 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
ONNX Runtime Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard a b 0.3874 0.7748 1.1622 1.5496 1.937 1.71077 1.72194 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard a b 130 260 390 520 650 584.30 580.47 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard a b 1.0187 2.0374 3.0561 4.0748 5.0935 4.48901 4.52770 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard a b 50 100 150 200 250 222.67 220.75 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: super-resolution-10 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.19 Model: super-resolution-10 - Device: CPU - Executor: Standard a b 3 6 9 12 15 10.03 10.13 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: super-resolution-10 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.19 Model: super-resolution-10 - Device: CPU - Executor: Standard a b 20 40 60 80 100 99.67 98.72 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
OSPRay Benchmark: gravity_spheres_volume/dim_512/scivis/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 3.2 Benchmark: gravity_spheres_volume/dim_512/scivis/real_time a b 3 6 9 12 15 10.30 10.38
OSPRay Benchmark: gravity_spheres_volume/dim_512/ao/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 3.2 Benchmark: gravity_spheres_volume/dim_512/ao/real_time a b 3 6 9 12 15 10.51 10.45
OSPRay Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 3.2 Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time a b 3 6 9 12 15 12.35 12.37
simdjson Throughput Test: LargeRandom OpenBenchmarking.org GB/s, More Is Better simdjson 3.10 Throughput Test: LargeRandom a b 0.2835 0.567 0.8505 1.134 1.4175 1.25 1.26 1. (CXX) g++ options: -O3 -lrt
LZ4 Compression Compression Level: 12 - Decompression Speed OpenBenchmarking.org MB/s, More Is Better LZ4 Compression 1.10 Compression Level: 12 - Decompression Speed a b 900 1800 2700 3600 4500 4331.0 4310.1 1. (CC) gcc options: -O3 -pthread
LZ4 Compression Compression Level: 12 - Compression Speed OpenBenchmarking.org MB/s, More Is Better LZ4 Compression 1.10 Compression Level: 12 - Compression Speed a b 3 6 9 12 15 12.70 12.65 1. (CC) gcc options: -O3 -pthread
NAMD Input: STMV with 1,066,628 Atoms OpenBenchmarking.org ns/day, More Is Better NAMD 3.0 Input: STMV with 1,066,628 Atoms a b 0.3555 0.711 1.0665 1.422 1.7775 1.57997 1.57748
Blender Blend File: Junkshop - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.3 Blend File: Junkshop - Compute: CPU-Only a b 12 24 36 48 60 52.71 52.82
Palabos Grid Size: 500 OpenBenchmarking.org Mega Site Updates Per Second, More Is Better Palabos 2.3 Grid Size: 500 a b 90 180 270 360 450 412.58 409.04 1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
GROMACS Input: water_GMX50_bare OpenBenchmarking.org Ns Per Day, More Is Better GROMACS Input: water_GMX50_bare a b 0.869 1.738 2.607 3.476 4.345 3.837 3.862 1. GROMACS version: 2023.1-Ubuntu_2023.1_2ubuntu1
Palabos Grid Size: 400 OpenBenchmarking.org Mega Site Updates Per Second, More Is Better Palabos 2.3 Grid Size: 400 a b 90 180 270 360 450 397.01 400.26 1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
Whisperfile Model Size: Tiny OpenBenchmarking.org Seconds, Fewer Is Better Whisperfile 20Aug24 Model Size: Tiny a b 12 24 36 48 60 50.45 51.75
OpenVINO GenAI Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time Per Output Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time Per Output Token a b 9 18 27 36 45 39.26 37.93
OpenVINO GenAI Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time To First Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time To First Token a b 20 40 60 80 100 104.07 98.97
OpenVINO GenAI Model: Falcon-7b-instruct-int4-ov - Device: CPU OpenBenchmarking.org tokens/s, More Is Better OpenVINO GenAI 2024.5 Model: Falcon-7b-instruct-int4-ov - Device: CPU a b 6 12 18 24 30 25.47 26.36
LiteRT Model: Inception V4 OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Inception V4 a b 5K 10K 15K 20K 25K 23103.8 22746.1
LiteRT Model: Inception ResNet V2 OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Inception ResNet V2 a b 6K 12K 18K 24K 30K 26499.5 26503.6
LiteRT Model: NASNet Mobile OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: NASNet Mobile a b 10K 20K 30K 40K 50K 42858.2 48856.7
LiteRT Model: Mobilenet Float OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Mobilenet Float a b 400 800 1200 1600 2000 2057.94 2020.43
LiteRT Model: SqueezeNet OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: SqueezeNet a b 700 1400 2100 2800 3500 3102.45 3060.90
LiteRT Model: DeepLab V3 OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: DeepLab V3 a b 1100 2200 3300 4400 5500 4985.32 5073.41
LiteRT Model: Quantized COCO SSD MobileNet v1 OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Quantized COCO SSD MobileNet v1 a b 700 1400 2100 2800 3500 3245.22 3213.83
LiteRT Model: Mobilenet Quant OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Mobilenet Quant a b 400 800 1200 1600 2000 1747.96 1708.44
Primesieve Length: 1e13 OpenBenchmarking.org Seconds, Fewer Is Better Primesieve 12.6 Length: 1e13 a b 11 22 33 44 55 50.27 50.28 1. (CXX) g++ options: -O3
Laghos Test: Triple Point Problem OpenBenchmarking.org Major Kernels Total Rate, More Is Better Laghos 3.1 Test: Triple Point Problem a b 50 100 150 200 250 219.97 220.28 1. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi
Blender Blend File: Fishy Cat - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.3 Blend File: Fishy Cat - Compute: CPU-Only a b 11 22 33 44 55 48.58 48.71
VVenC Video Input: Bosphorus 4K - Video Preset: Faster OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.13 Video Input: Bosphorus 4K - Video Preset: Faster a b 4 8 12 16 20 13.23 13.63 1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
Intel Open Image Denoise Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only OpenBenchmarking.org Images / Sec, More Is Better Intel Open Image Denoise 2.3 Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only a b 0.1508 0.3016 0.4524 0.6032 0.754 0.67 0.67
Timed PHP Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed PHP Compilation 8.3.4 Time To Compile a b 10 20 30 40 50 44.66 44.32
LZ4 Compression Compression Level: 9 - Decompression Speed OpenBenchmarking.org MB/s, More Is Better LZ4 Compression 1.10 Compression Level: 9 - Decompression Speed a b 900 1800 2700 3600 4500 4217.5 4226.5 1. (CC) gcc options: -O3 -pthread
LZ4 Compression Compression Level: 9 - Compression Speed OpenBenchmarking.org MB/s, More Is Better LZ4 Compression 1.10 Compression Level: 9 - Compression Speed a b 8 16 24 32 40 36.67 36.78 1. (CC) gcc options: -O3 -pthread
SVT-AV1 Encoder Mode: Preset 5 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 5 - Input: Bosphorus 1080p a b 20 40 60 80 100 84.19 84.52 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
LZ4 Compression Compression Level: 1 - Decompression Speed OpenBenchmarking.org MB/s, More Is Better LZ4 Compression 1.10 Compression Level: 1 - Decompression Speed a b 1000 2000 3000 4000 5000 4460.4 4459.3 1. (CC) gcc options: -O3 -pthread
LZ4 Compression Compression Level: 1 - Compression Speed OpenBenchmarking.org MB/s, More Is Better LZ4 Compression 1.10 Compression Level: 1 - Compression Speed a b 150 300 450 600 750 695.86 695.34 1. (CC) gcc options: -O3 -pthread
LZ4 Compression Compression Level: 2 - Decompression Speed OpenBenchmarking.org MB/s, More Is Better LZ4 Compression 1.10 Compression Level: 2 - Decompression Speed a b 800 1600 2400 3200 4000 3856.4 3854.3 1. (CC) gcc options: -O3 -pthread
LZ4 Compression Compression Level: 2 - Compression Speed OpenBenchmarking.org MB/s, More Is Better LZ4 Compression 1.10 Compression Level: 2 - Compression Speed a b 70 140 210 280 350 315.74 316.56 1. (CC) gcc options: -O3 -pthread
LZ4 Compression Compression Level: 3 - Decompression Speed OpenBenchmarking.org MB/s, More Is Better LZ4 Compression 1.10 Compression Level: 3 - Decompression Speed a b 900 1800 2700 3600 4500 4031.0 4037.8 1. (CC) gcc options: -O3 -pthread
LZ4 Compression Compression Level: 3 - Compression Speed OpenBenchmarking.org MB/s, More Is Better LZ4 Compression 1.10 Compression Level: 3 - Compression Speed a b 20 40 60 80 100 107.13 107.11 1. (CC) gcc options: -O3 -pthread
WebP Image Encode Encode Settings: Quality 100, Lossless, Highest Compression OpenBenchmarking.org MP/s, More Is Better WebP Image Encode 1.4 Encode Settings: Quality 100, Lossless, Highest Compression a b 0.1283 0.2566 0.3849 0.5132 0.6415 0.57 0.57 1. (CC) gcc options: -fvisibility=hidden -O2 -lm
Timed Eigen Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed Eigen Compilation 3.4.0 Time To Compile a b 10 20 30 40 50 42.21 40.79
uvg266 Video Input: Bosphorus 4K - Video Preset: Slow OpenBenchmarking.org Frames Per Second, More Is Better uvg266 0.8.0 Video Input: Bosphorus 4K - Video Preset: Slow a b 4 8 12 16 20 15.47 15.46
Blender Blend File: BMW27 - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.3 Blend File: BMW27 - Compute: CPU-Only a b 8 16 24 32 40 36.74 36.36
uvg266 Video Input: Bosphorus 4K - Video Preset: Medium OpenBenchmarking.org Frames Per Second, More Is Better uvg266 0.8.0 Video Input: Bosphorus 4K - Video Preset: Medium a b 4 8 12 16 20 17.14 17.11
Z3 Theorem Prover SMT File: 1.smt2 OpenBenchmarking.org Seconds, Fewer Is Better Z3 Theorem Prover 4.12.1 SMT File: 1.smt2 a b 8 16 24 32 40 32.66 32.82 1. (CXX) g++ options: -lpthread -std=c++17 -fvisibility=hidden -mfpmath=sse -msse -msse2 -O3 -fPIC
VVenC Video Input: Bosphorus 1080p - Video Preset: Fast OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.13 Video Input: Bosphorus 1080p - Video Preset: Fast a b 5 10 15 20 25 19.41 19.94 1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
NAMD Input: ATPase with 327,506 Atoms OpenBenchmarking.org ns/day, More Is Better NAMD 3.0 Input: ATPase with 327,506 Atoms a b 1.1633 2.3266 3.4899 4.6532 5.8165 5.17004 5.13084
SVT-AV1 Encoder Mode: Preset 8 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 8 - Input: Bosphorus 4K a b 20 40 60 80 100 91.14 91.17 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Cpuminer-Opt Algorithm: x20r OpenBenchmarking.org kH/s, More Is Better Cpuminer-Opt 24.3 Algorithm: x20r a b 3K 6K 9K 12K 15K 12480 12750 1. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lgmp
Cpuminer-Opt Algorithm: Garlicoin OpenBenchmarking.org kH/s, More Is Better Cpuminer-Opt 24.3 Algorithm: Garlicoin a b 1600 3200 4800 6400 8000 7384.28 6969.32 1. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lgmp
Cpuminer-Opt Algorithm: LBC, LBRY Credits OpenBenchmarking.org kH/s, More Is Better Cpuminer-Opt 24.3 Algorithm: LBC, LBRY Credits a b 5K 10K 15K 20K 25K 24690 24690 1. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lgmp
Stress-NG Test: Radix String Sort OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.17.08 Test: Radix String Sort a b 160 320 480 640 800 739.37 738.70 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Stress-NG Test: Socket Activity OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.17.08 Test: Socket Activity a b 4K 8K 12K 16K 20K 17142.77 17128.52 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Stress-NG Test: CPU Stress OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.17.08 Test: CPU Stress a b 20K 40K 60K 80K 100K 83077.92 83768.42 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Stress-NG Test: Context Switching OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.17.08 Test: Context Switching a b 1.7M 3.4M 5.1M 6.8M 8.5M 7693259.49 7764860.59 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Stress-NG Test: Bitonic Integer Sort OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.17.08 Test: Bitonic Integer Sort a b 80 160 240 320 400 385.60 385.56 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Cpuminer-Opt Algorithm: Ringcoin OpenBenchmarking.org kH/s, More Is Better Cpuminer-Opt 24.3 Algorithm: Ringcoin a b 1100 2200 3300 4400 5500 5108.30 5092.89 1. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lgmp
Cpuminer-Opt Algorithm: Magi OpenBenchmarking.org kH/s, More Is Better Cpuminer-Opt 24.3 Algorithm: Magi a b 200 400 600 800 1000 952.87 958.57 1. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lgmp
Cpuminer-Opt Algorithm: Quad SHA-256, Pyrite OpenBenchmarking.org kH/s, More Is Better Cpuminer-Opt 24.3 Algorithm: Quad SHA-256, Pyrite a b 20K 40K 60K 80K 100K 102660 102640 1. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lgmp
Cpuminer-Opt Algorithm: Triple SHA-256, Onecoin OpenBenchmarking.org kH/s, More Is Better Cpuminer-Opt 24.3 Algorithm: Triple SHA-256, Onecoin a b 30K 60K 90K 120K 150K 121450 121600 1. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lgmp
Cpuminer-Opt Algorithm: Deepcoin OpenBenchmarking.org kH/s, More Is Better Cpuminer-Opt 24.3 Algorithm: Deepcoin a b 3K 6K 9K 12K 15K 13680 13690 1. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lgmp
Cpuminer-Opt Algorithm: Myriad-Groestl OpenBenchmarking.org kH/s, More Is Better Cpuminer-Opt 24.3 Algorithm: Myriad-Groestl a b 4K 8K 12K 16K 20K 17760 17760 1. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lgmp
Cpuminer-Opt Algorithm: Skeincoin OpenBenchmarking.org kH/s, More Is Better Cpuminer-Opt 24.3 Algorithm: Skeincoin a b 14K 28K 42K 56K 70K 67110 67110 1. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lgmp
Cpuminer-Opt Algorithm: scrypt OpenBenchmarking.org kH/s, More Is Better Cpuminer-Opt 24.3 Algorithm: scrypt a b 100 200 300 400 500 482.67 482.56 1. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lgmp
Cpuminer-Opt Algorithm: Blake-2 S OpenBenchmarking.org kH/s, More Is Better Cpuminer-Opt 24.3 Algorithm: Blake-2 S a b 50K 100K 150K 200K 250K 220790 220810 1. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lgmp
ACES DGEMM Sustained Floating-Point Rate OpenBenchmarking.org GFLOP/s, More Is Better ACES DGEMM 1.0 Sustained Floating-Point Rate a b 400 800 1200 1600 2000 1704.04 1711.89 1. (CC) gcc options: -ffast-math -mavx2 -O3 -fopenmp -lopenblas
WarpX Input: Plasma Acceleration OpenBenchmarking.org Seconds, Fewer Is Better WarpX 24.10 Input: Plasma Acceleration a b 7 14 21 28 35 28.40 28.82 1. (CXX) g++ options: -O3 -lm
7-Zip Compression Test: Decompression Rating OpenBenchmarking.org MIPS, More Is Better 7-Zip Compression 24.05 Test: Decompression Rating a b 50K 100K 150K 200K 250K 239848 248785 1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC
7-Zip Compression Test: Compression Rating OpenBenchmarking.org MIPS, More Is Better 7-Zip Compression 24.05 Test: Compression Rating a b 50K 100K 150K 200K 250K 256434 251980 1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC
Llama.cpp Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024 a b 60 120 180 240 300 267.01 189.81 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
Llama.cpp Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128 a b 6 12 18 24 30 25.81 25.67 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
WarpX Input: Uniform Plasma OpenBenchmarking.org Seconds, Fewer Is Better WarpX 24.10 Input: Uniform Plasma a b 5 10 15 20 25 19.57 19.61 1. (CXX) g++ options: -O3 -lm
Llama.cpp Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128 a b 6 12 18 24 30 27.16 27.24 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
oneDNN Harness: Deconvolution Batch shapes_1d - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: Deconvolution Batch shapes_1d - Engine: CPU a b 1.3072 2.6144 3.9216 5.2288 6.536 5.80994 5.70868 MIN: 3.84 MIN: 5.17 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
x265 Video Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better x265 Video Input: Bosphorus 4K a b 6 12 18 24 30 26.13 25.79 1. x265 [info]: HEVC encoder version 3.5+1-f0c1022b6
OpenVINO GenAI Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU - Time Per Output Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU - Time Per Output Token a b 5 10 15 20 25 19.43 19.47
OpenVINO GenAI Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU - Time To First Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU - Time To First Token a b 5 10 15 20 25 21.71 21.89
OpenVINO GenAI Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU OpenBenchmarking.org tokens/s, More Is Better OpenVINO GenAI 2024.5 Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU a b 12 24 36 48 60 51.47 51.35
Intel Open Image Denoise Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only OpenBenchmarking.org Images / Sec, More Is Better Intel Open Image Denoise 2.3 Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only a b 0.3173 0.6346 0.9519 1.2692 1.5865 1.41 1.41
Intel Open Image Denoise Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only OpenBenchmarking.org Images / Sec, More Is Better Intel Open Image Denoise 2.3 Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only a b 0.3173 0.6346 0.9519 1.2692 1.5865 1.41 1.41
OpenVINO GenAI Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time Per Output Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time Per Output Token a b 6 12 18 24 30 27.34 27.41
OpenVINO GenAI Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time To First Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time To First Token a b 13 26 39 52 65 58.48 57.04
OpenVINO GenAI Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU OpenBenchmarking.org tokens/s, More Is Better OpenVINO GenAI 2024.5 Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU a b 8 16 24 32 40 36.58 36.48
Rustls Benchmark: handshake - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 OpenBenchmarking.org handshakes/s, More Is Better Rustls 0.23.17 Benchmark: handshake - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 a b 20K 40K 60K 80K 100K 82026.08 81973.92 1. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
x265 Video Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better x265 4.1 Video Input: Bosphorus 4K a b 7 14 21 28 35 30.76 29.69 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
ASTC Encoder Preset: Thorough OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 5.0 Preset: Thorough a b 7 14 21 28 35 29.80 29.83 1. (CXX) g++ options: -O3 -flto -pthread
C-Ray Resolution: 1080p - Rays Per Pixel: 16 OpenBenchmarking.org Seconds, Fewer Is Better C-Ray 2.0 Resolution: 1080p - Rays Per Pixel: 16 a b 5 10 15 20 25 18.84 18.85 1. (CC) gcc options: -lpthread -lm
WebP Image Encode Encode Settings: Quality 100, Lossless OpenBenchmarking.org MP/s, More Is Better WebP Image Encode 1.4 Encode Settings: Quality 100, Lossless a b 0.3263 0.6526 0.9789 1.3052 1.6315 1.44 1.45 1. (CC) gcc options: -fvisibility=hidden -O2 -lm
srsRAN Project Test: PDSCH Processor Benchmark, Throughput Total OpenBenchmarking.org Mbps, More Is Better srsRAN Project 24.10 Test: PDSCH Processor Benchmark, Throughput Total a b 9K 18K 27K 36K 45K 40641.5 41979.9 1. (CXX) g++ options: -O3 -march=native -mtune=generic -fno-trapping-math -fno-math-errno -ldl
SVT-AV1 Encoder Mode: Preset 13 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 13 - Input: Bosphorus 4K a b 40 80 120 160 200 184.52 166.11 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
VVenC Video Input: Bosphorus 1080p - Video Preset: Faster OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.13 Video Input: Bosphorus 1080p - Video Preset: Faster a b 9 18 27 36 45 40.00 39.21 1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
SVT-AV1 Encoder Mode: Preset 8 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 8 - Input: Bosphorus 1080p a b 50 100 150 200 250 243.70 244.44 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
uvg266 Video Input: Bosphorus 4K - Video Preset: Very Fast OpenBenchmarking.org Frames Per Second, More Is Better uvg266 0.8.0 Video Input: Bosphorus 4K - Video Preset: Very Fast a b 9 18 27 36 45 38.45 38.35
uvg266 Video Input: Bosphorus 4K - Video Preset: Super Fast OpenBenchmarking.org Frames Per Second, More Is Better uvg266 0.8.0 Video Input: Bosphorus 4K - Video Preset: Super Fast a b 9 18 27 36 45 39.26 39.55
oneDNN Harness: IP Shapes 1D - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: IP Shapes 1D - Engine: CPU a b 0.1975 0.395 0.5925 0.79 0.9875 0.877683 0.875916 MIN: 0.85 MIN: 0.84 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
uvg266 Video Input: Bosphorus 4K - Video Preset: Ultra Fast OpenBenchmarking.org Frames Per Second, More Is Better uvg266 0.8.0 Video Input: Bosphorus 4K - Video Preset: Ultra Fast a b 10 20 30 40 50 42.08 41.85
uvg266 Video Input: Bosphorus 1080p - Video Preset: Slow OpenBenchmarking.org Frames Per Second, More Is Better uvg266 0.8.0 Video Input: Bosphorus 1080p - Video Preset: Slow a b 10 20 30 40 50 44.16 43.99
uvg266 Video Input: Bosphorus 1080p - Video Preset: Medium OpenBenchmarking.org Frames Per Second, More Is Better uvg266 0.8.0 Video Input: Bosphorus 1080p - Video Preset: Medium a b 11 22 33 44 55 48.47 48.59
Y-Cruncher Pi Digits To Calculate: 1B OpenBenchmarking.org Seconds, Fewer Is Better Y-Cruncher 0.8.5 Pi Digits To Calculate: 1B a b 3 6 9 12 15 9.230 9.166
ASTC Encoder Preset: Fast OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 5.0 Preset: Fast a b 120 240 360 480 600 534.84 534.56 1. (CXX) g++ options: -O3 -flto -pthread
oneDNN Harness: IP Shapes 3D - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: IP Shapes 3D - Engine: CPU a b 0.163 0.326 0.489 0.652 0.815 0.724443 0.718503 MIN: 0.69 MIN: 0.69 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
Llama.cpp Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128 a b 16 32 48 64 80 74.22 73.02 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
x265 Video Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better x265 Video Input: Bosphorus 1080p a b 20 40 60 80 100 76.04 76.17 1. x265 [info]: HEVC encoder version 3.5+1-f0c1022b6
ASTC Encoder Preset: Medium OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 5.0 Preset: Medium a b 50 100 150 200 250 221.82 221.83 1. (CXX) g++ options: -O3 -flto -pthread
SVT-AV1 Encoder Mode: Preset 13 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 13 - Input: Bosphorus 1080p a b 110 220 330 440 550 529.37 505.64 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
x265 Video Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better x265 4.1 Video Input: Bosphorus 1080p a b 20 40 60 80 100 87.75 85.50 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
WebP Image Encode Encode Settings: Quality 100, Highest Compression OpenBenchmarking.org MP/s, More Is Better WebP Image Encode 1.4 Encode Settings: Quality 100, Highest Compression a b 0.81 1.62 2.43 3.24 4.05 3.60 3.60 1. (CC) gcc options: -fvisibility=hidden -O2 -lm
Y-Cruncher Pi Digits To Calculate: 500M OpenBenchmarking.org Seconds, Fewer Is Better Y-Cruncher 0.8.5 Pi Digits To Calculate: 500M a b 1.013 2.026 3.039 4.052 5.065 4.485 4.502
oneDNN Harness: Convolution Batch Shapes Auto - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: Convolution Batch Shapes Auto - Engine: CPU a b 0.2475 0.495 0.7425 0.99 1.2375 1.09519 1.10013 MIN: 1.07 MIN: 1.07 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
uvg266 Video Input: Bosphorus 1080p - Video Preset: Very Fast OpenBenchmarking.org Frames Per Second, More Is Better uvg266 0.8.0 Video Input: Bosphorus 1080p - Video Preset: Very Fast a b 30 60 90 120 150 121.30 123.34
uvg266 Video Input: Bosphorus 1080p - Video Preset: Super Fast OpenBenchmarking.org Frames Per Second, More Is Better uvg266 0.8.0 Video Input: Bosphorus 1080p - Video Preset: Super Fast a b 30 60 90 120 150 131.92 133.52
uvg266 Video Input: Bosphorus 1080p - Video Preset: Ultra Fast OpenBenchmarking.org Frames Per Second, More Is Better uvg266 0.8.0 Video Input: Bosphorus 1080p - Video Preset: Ultra Fast a b 30 60 90 120 150 143.36 141.19
Primesieve Length: 1e12 OpenBenchmarking.org Seconds, Fewer Is Better Primesieve 12.6 Length: 1e12 a b 0.934 1.868 2.802 3.736 4.67 4.150 4.151 1. (CXX) g++ options: -O3
oneDNN Harness: Deconvolution Batch shapes_3d - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: Deconvolution Batch shapes_3d - Engine: CPU a b 0.3906 0.7812 1.1718 1.5624 1.953 1.72958 1.73590 MIN: 1.72 MIN: 1.72 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
WebP Image Encode Encode Settings: Quality 100 OpenBenchmarking.org MP/s, More Is Better WebP Image Encode 1.4 Encode Settings: Quality 100 a b 3 6 9 12 15 11.30 11.42 1. (CC) gcc options: -fvisibility=hidden -O2 -lm
WebP Image Encode Encode Settings: Default OpenBenchmarking.org MP/s, More Is Better WebP Image Encode 1.4 Encode Settings: Default a b 5 10 15 20 25 18.60 17.33 1. (CC) gcc options: -fvisibility=hidden -O2 -lm
Phoronix Test Suite v10.8.5