AMD EPYC 9654 EOY2024 Tests for a future article. 2 x AMD EPYC 9654 96-Core testing with a AMD Titanite_4G (RTI1007B BIOS) and ASPEED on Ubuntu 23.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2412279-NE-AMDEPYC9651&grr .
AMD EPYC 9654 EOY2024 Processor Motherboard Chipset Memory Disk Graphics Network OS Kernel Compiler File-System Screen Resolution a b 2 x AMD EPYC 9654 96-Core @ 2.40GHz (192 Cores / 384 Threads) AMD Titanite_4G (RTI1007B BIOS) AMD Device 14a4 1520GB 3201GB Micron_7450_MTFDKCB3T2TFS ASPEED Broadcom NetXtreme BCM5720 PCIe Ubuntu 23.10 6.9.0-060900rc1daily20240327-generic (x86_64) GCC 13.2.0 ext4 1920x1200 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10113e Java Details - OpenJDK Runtime Environment (build 11.0.23+9-post-Ubuntu-1ubuntu123.10.1) Python Details - Python 3.11.6 Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
AMD EPYC 9654 EOY2024 brl-cad: VGR Performance Metric epoch: Cone graph500: 26 graph500: 26 graph500: 26 graph500: 26 rustls: handshake-resume - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 rustls: handshake-ticket - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 openssl: RSA4096 openssl: RSA4096 xnnpack: QS8MobileNetV2 xnnpack: FP16MobileNetV3Small xnnpack: FP16MobileNetV3Large xnnpack: FP16MobileNetV2 xnnpack: FP16MobileNetV1 xnnpack: FP32MobileNetV3Small xnnpack: FP32MobileNetV3Large xnnpack: FP32MobileNetV2 xnnpack: FP32MobileNetV1 srsran: PUSCH Processor Benchmark, Throughput Total palabos: 400 cp2k: H20-256 svt-av1: Preset 3 - Bosphorus 4K palabos: 500 build-linux-kernel: allmodconfig palabos: 1000 quantlib: S openssl: AES-256-GCM openssl: AES-128-GCM openssl: ChaCha20-Poly1305 openssl: SHA512 openssl: ChaCha20-Poly1305 openssl: ChaCha20 openssl: SHA256 palabos: 100 rustls: handshake-ticket - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 stockfish: Chess Benchmark rustls: handshake-resume - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 build-nodejs: Time To Compile cp2k: Fayalite-FIST namd: STMV with 1,066,628 Atoms blender: Barbershop - CPU-Only ospray-studio: 3 - 1080p - 32 - Path Tracer - CPU ospray-studio: 1 - 1080p - 32 - Path Tracer - CPU ospray-studio: 2 - 1080p - 32 - Path Tracer - CPU ospray-studio: 2 - 4K - 32 - Path Tracer - CPU ospray-studio: 1 - 4K - 32 - Path Tracer - CPU simdjson: DistinctUserID simdjson: TopTweet renaissance: Apache Spark PageRank simdjson: PartialTweets laghos: Sedov Blast Wave, ube_922_hex.mesh ospray-studio: 3 - 1080p - 1 - Path Tracer - CPU onednn: Recurrent Neural Network Training - CPU onednn: Recurrent Neural Network Inference - CPU ospray-studio: 2 - 1080p - 1 - Path Tracer - CPU ospray-studio: 1 - 1080p - 1 - Path Tracer - CPU renaissance: Apache Spark Bayes rustls: handshake - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 openvino: Noise Suppression Poconet-Like FP16 - CPU openvino: Noise Suppression Poconet-Like FP16 - CPU svt-av1: Preset 5 - Bosphorus 4K build2: Time To Compile openvino: Face Detection FP16-INT8 - CPU openvino: Face Detection FP16-INT8 - CPU ospray-studio: 3 - 4K - 1 - Path Tracer - CPU ospray-studio: 2 - 4K - 1 - Path Tracer - CPU simdjson: Kostya openvino: Machine Translation EN To DE FP16 - CPU openvino: Machine Translation EN To DE FP16 - CPU namd: ATPase with 327,506 Atoms openvino: Person Detection FP16 - CPU openvino: Person Detection FP16 - CPU ospray-studio: 1 - 4K - 1 - Path Tracer - CPU openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Person Re-Identification Retail FP16 - CPU openvino: Person Re-Identification Retail FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU litert: NASNet Mobile openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU graphics-magick: Resizing openvino: Face Detection Retail FP16 - CPU openvino: Face Detection Retail FP16 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Vehicle Detection FP16-INT8 - CPU laghos: Triple Point Problem graphics-magick: Noise-Gaussian graphics-magick: Rotate graphics-magick: HWB Color Space graphics-magick: Enhanced graphics-magick: Sharpen graphics-magick: Swirl simdjson: LargeRand litert: Inception ResNet V2 warpx: Uniform Plasma litert: Inception V4 litert: DeepLab V3 litert: Quantized COCO SSD MobileNet v1 litert: Mobilenet Quant litert: Mobilenet Float litert: SqueezeNet compress-7zip: Decompression Rating compress-7zip: Compression Rating quantlib: XXS openvino-genai: Gemma-7b-int4-ov - CPU - Time Per Output Token openvino-genai: Gemma-7b-int4-ov - CPU - Time To First Token openvino-genai: Gemma-7b-int4-ov - CPU webp: Quality 100, Lossless, Highest Compression ospray-studio: 3 - 4K - 32 - Path Tracer - CPU warpx: Plasma Acceleration build-php: Time To Compile gromacs: water_GMX50_bare y-cruncher: 5B cp2k: H20-64 srsran: PDSCH Processor Benchmark, Throughput Total c-ray: 5K - 16 stress-ng: Context Switching stress-ng: Socket Activity stress-ng: Radix String Sort stress-ng: Semaphores stress-ng: CPU Stress stress-ng: Crypto openvino-genai: Falcon-7b-instruct-int4-ov - CPU - Time Per Output Token openvino-genai: Falcon-7b-instruct-int4-ov - CPU - Time To First Token openvino-genai: Falcon-7b-instruct-int4-ov - CPU build-linux-kernel: defconfig blender: Pabellon Barcelona - CPU-Only build-eigen: Time To Compile blender: Classroom - CPU-Only onednn: Deconvolution Batch shapes_1d - CPU svt-av1: Preset 8 - Bosphorus 4K openvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU - Time Per Output Token openvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU - Time To First Token openvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU rustls: handshake - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 astcenc: Very Thorough oidn: RTLightmap.hdr.4096x4096 - CPU-Only astcenc: Exhaustive uvg266: Bosphorus 4K - Slow build-ffmpeg: Time To Compile c-ray: 4K - 16 uvg266: Bosphorus 4K - Medium x265: Bosphorus 4K webp: Quality 100, Lossless openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU - Time Per Output Token openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU - Time To First Token openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU svt-av1: Preset 13 - Bosphorus 4K onednn: IP Shapes 1D - CPU blender: Junkshop - CPU-Only y-cruncher: 1B mt-dgemm: Sustained Floating-Point Rate blender: Fishy Cat - CPU-Only astcenc: Fast primesieve: 1e13 uvg266: Bosphorus 4K - Ultra Fast uvg266: Bosphorus 4K - Super Fast uvg266: Bosphorus 4K - Very Fast blender: BMW27 - CPU-Only oidn: RT.ldr_alb_nrm.3840x2160 - CPU-Only oidn: RT.hdr_alb_nrm.3840x2160 - CPU-Only onednn: IP Shapes 3D - CPU y-cruncher: 500M astcenc: Thorough webp: Quality 100, Highest Compression x265: Bosphorus 1080p onednn: Convolution Batch Shapes Auto - CPU astcenc: Medium c-ray: 1080p - 16 onednn: Deconvolution Batch shapes_3d - CPU webp: Quality 100 webp: Default primesieve: 1e12 a b 5922638 471.09 875608000 602409000 1593590000 1491210000 1655182.63 1331242.71 2720827.9 71859.3 52775 173889 237288 58191 26728 117110 433577 69788 45577 7633.2 487.953 222.296 10.922 523.865 221.596 620.526 112.166 1441924066780 1678499904720 660711095770 73239263390 662197600530 950440659350 239063908950 702.003 1902840.03 454312151 2668332.01 113.933 104.427 5.14602 87.52 6582 5596 5674 22277 22116 6.66 7.18 3357.1 6.66 418.98 209 705.83 768.345 180 177 278.6 396787.89 27.13 6603.69 37.571 65.44 265.07 180.61 820 705 4.25 50.88 942.53 17.56222 60.91 787.3 691 5.08 9423.76 6.27 30463.96 4.15 11541.9 17.2 2786.59 15.06 3183.67 40.36 4735.48 43.99 4356.53 20.67 9176.33 0.36 161304.5 1186260 10.59 17923.31 48 2.29 20824.07 4.69 10202.35 217.80400891 170 112 196 358 263 544 1.26 353192 41.44746595 241705 79532.8 105403 104382 16096.3 25283.7 1135910 752410 116.023 34.08 49.79 29.34 0.57 31282 37.27427155 40.235 12.375 27.569 23.744 130691.7 30.59 35685472.59 26852.29 2379.83 248747680.85 405259.43 5791606.6 22.82 35.4 43.81 29.758 28.03 28.502 23.82 31.7324 124.494 17.61 20.41 56.8 81958.94 17.0296 1.67 10.4783 32.13 18.415 17.616 34.03 35.88 1.45 21.14 29.16 47.29 182.65 1.02958 13.42 8.271 3735.547658 12.67 576.8108 12.649 49.86 49.2 51.21 9.37 3.40 3.41 0.677961 4.412 114.2699 3.61 98.66 0.502239 484.6922 4.928 0.983651 11.26 18.62 1.053 5927856 470.96 813842000 578201000 1628240000 1488380000 1651500.01 1348885.92 2720109.9 71862.6 91761 71969 73274 59395 20029 82114 121623 76110 26076 7190.6 487.364 224.128 11.061 522.463 215.947 620.55 112.284 1444631978260 1677265930100 661001317310 73140553620 661483094290 950646066980 239202941250 695.182 1920689.19 495148838 2641193.17 114.507 105.495 5.13310 87.67 6625 5624 5655 22166 22243 7.02 7.27 3203.0 6.64 419.08 209 705.304 768.514 179 177 281.9 397683.95 27.26 6653.05 39.124 64.785 265.16 180.55 818 709 4.25 51.05 939.49 17.41921 60.6 791.5 695 5.08 9417.61 6.27 30463.69 4.15 11548.65 17.16 2793.9 15.02 3191.83 40.22 4750.95 45.22 4239.38 20.69 9176.35 0.36 159292.51 1167580 10.63 17827.22 47 2.29 20830.76 4.69 10196.6 209.998356317 168 112 195 348 259 529 1.26 305612 41.60080985 245655 100615 25207.6 26668.7 24019.2 27388.1 1136332 744362 116.053 34.15 49.11 29.28 0.57 31052 37.15505391 40.178 12.318 27.405 23.25 129688 30.851 34977675.68 26821.69 2381.41 251733262.16 406176.4 284241670.58 22.94 35.35 43.59 29.81 27.82 28.486 23.7 32.466 116.554 17.79 20.56 56.21 81993.85 17.0395 1.66 10.4757 33.45 18.512 17.5 33.64 36.14 1.46 21.56 29.7 46.38 173.649 1.02654 13.38 8.28 3742.802523 12.59 574.782 12.639 47.42 49.88 50.34 9.47 3.42 3.43 0.67961 4.382 113.6111 3.60 97.8 0.577142 489.4361 4.603 0.994893 11.39 18.58 1.057 OpenBenchmarking.org
BRL-CAD VGR Performance Metric OpenBenchmarking.org VGR Performance Metric, More Is Better BRL-CAD 7.38.2 VGR Performance Metric a b 1.3M 2.6M 3.9M 5.2M 6.5M 5922638 5927856 1. (CXX) g++ options: -std=c++17 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lnetpbm -lregex_brl -lz_brl -lassimp -ldl -lm -ltk8.6
Epoch Epoch3D Deck: Cone OpenBenchmarking.org Seconds, Fewer Is Better Epoch 4.19.4 Epoch3D Deck: Cone a b 100 200 300 400 500 471.09 470.96 1. (F9X) gfortran options: -O3 -std=f2003 -Jobj -lsdf -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Graph500 Scale: 26 OpenBenchmarking.org sssp max_TEPS, More Is Better Graph500 3.0 Scale: 26 a b 200M 400M 600M 800M 1000M 875608000 813842000 1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi
Graph500 Scale: 26 OpenBenchmarking.org sssp median_TEPS, More Is Better Graph500 3.0 Scale: 26 a b 130M 260M 390M 520M 650M 602409000 578201000 1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi
Graph500 Scale: 26 OpenBenchmarking.org bfs max_TEPS, More Is Better Graph500 3.0 Scale: 26 a b 300M 600M 900M 1200M 1500M 1593590000 1628240000 1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi
Graph500 Scale: 26 OpenBenchmarking.org bfs median_TEPS, More Is Better Graph500 3.0 Scale: 26 a b 300M 600M 900M 1200M 1500M 1491210000 1488380000 1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi
Rustls Benchmark: handshake-resume - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 OpenBenchmarking.org handshakes/s, More Is Better Rustls 0.23.17 Benchmark: handshake-resume - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 a b 400K 800K 1200K 1600K 2000K 1655182.63 1651500.01 1. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
Rustls Benchmark: handshake-ticket - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 OpenBenchmarking.org handshakes/s, More Is Better Rustls 0.23.17 Benchmark: handshake-ticket - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 a b 300K 600K 900K 1200K 1500K 1331242.71 1348885.92 1. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
OpenSSL Algorithm: RSA4096 OpenBenchmarking.org verify/s, More Is Better OpenSSL 3.3 Algorithm: RSA4096 a b 600K 1200K 1800K 2400K 3000K 2720827.9 2720109.9 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL Algorithm: RSA4096 OpenBenchmarking.org sign/s, More Is Better OpenSSL 3.3 Algorithm: RSA4096 a b 15K 30K 45K 60K 75K 71859.3 71862.6 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
XNNPACK Model: QS8MobileNetV2 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: QS8MobileNetV2 a b 20K 40K 60K 80K 100K 52775 91761 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP16MobileNetV3Small OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP16MobileNetV3Small a b 40K 80K 120K 160K 200K 173889 71969 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP16MobileNetV3Large OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP16MobileNetV3Large a b 50K 100K 150K 200K 250K 237288 73274 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP16MobileNetV2 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP16MobileNetV2 a b 13K 26K 39K 52K 65K 58191 59395 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP16MobileNetV1 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP16MobileNetV1 a b 6K 12K 18K 24K 30K 26728 20029 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP32MobileNetV3Small OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP32MobileNetV3Small a b 30K 60K 90K 120K 150K 117110 82114 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP32MobileNetV3Large OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP32MobileNetV3Large a b 90K 180K 270K 360K 450K 433577 121623 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP32MobileNetV2 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP32MobileNetV2 a b 16K 32K 48K 64K 80K 69788 76110 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP32MobileNetV1 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP32MobileNetV1 a b 10K 20K 30K 40K 50K 45577 26076 1. (CXX) g++ options: -O3 -lrt -lm
srsRAN Project Test: PUSCH Processor Benchmark, Throughput Total OpenBenchmarking.org Mbps, More Is Better srsRAN Project 24.10 Test: PUSCH Processor Benchmark, Throughput Total a b 1600 3200 4800 6400 8000 7633.2 7190.6 1. (CXX) g++ options: -O3 -march=native -mtune=generic -fno-trapping-math -fno-math-errno -ldl
Palabos Grid Size: 400 OpenBenchmarking.org Mega Site Updates Per Second, More Is Better Palabos 2.3 Grid Size: 400 a b 110 220 330 440 550 487.95 487.36 1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
CP2K Molecular Dynamics Input: H20-256 OpenBenchmarking.org Seconds, Fewer Is Better CP2K Molecular Dynamics 2024.3 Input: H20-256 a b 50 100 150 200 250 222.30 224.13 1. (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
SVT-AV1 Encoder Mode: Preset 3 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 3 - Input: Bosphorus 4K a b 3 6 9 12 15 10.92 11.06 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Palabos Grid Size: 500 OpenBenchmarking.org Mega Site Updates Per Second, More Is Better Palabos 2.3 Grid Size: 500 a b 110 220 330 440 550 523.87 522.46 1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
Timed Linux Kernel Compilation Build: allmodconfig OpenBenchmarking.org Seconds, Fewer Is Better Timed Linux Kernel Compilation 6.8 Build: allmodconfig a b 50 100 150 200 250 221.60 215.95
Palabos Grid Size: 1000 OpenBenchmarking.org Mega Site Updates Per Second, More Is Better Palabos 2.3 Grid Size: 1000 a b 130 260 390 520 650 620.53 620.55 1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
QuantLib Size: S OpenBenchmarking.org tasks/s, More Is Better QuantLib 1.35-dev Size: S a b 30 60 90 120 150 112.17 112.28 1. (CXX) g++ options: -O3 -march=native -fPIE -pie
OpenSSL Algorithm: AES-256-GCM OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: AES-256-GCM a b 300000M 600000M 900000M 1200000M 1500000M 1441924066780 1444631978260 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL Algorithm: AES-128-GCM OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: AES-128-GCM a b 400000M 800000M 1200000M 1600000M 2000000M 1678499904720 1677265930100 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL Algorithm: ChaCha20-Poly1305 OpenBenchmarking.org byte/s, More Is Better OpenSSL Algorithm: ChaCha20-Poly1305 a b 140000M 280000M 420000M 560000M 700000M 660711095770 661001317310 1. OpenSSL 3.0.10 1 Aug 2023 (Library: OpenSSL 3.0.10 1 Aug 2023)
OpenSSL Algorithm: SHA512 OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: SHA512 a b 16000M 32000M 48000M 64000M 80000M 73239263390 73140553620 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL Algorithm: ChaCha20-Poly1305 OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: ChaCha20-Poly1305 a b 140000M 280000M 420000M 560000M 700000M 662197600530 661483094290 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL Algorithm: ChaCha20 OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: ChaCha20 a b 200000M 400000M 600000M 800000M 1000000M 950440659350 950646066980 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL Algorithm: SHA256 OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.3 Algorithm: SHA256 a b 50000M 100000M 150000M 200000M 250000M 239063908950 239202941250 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
Palabos Grid Size: 100 OpenBenchmarking.org Mega Site Updates Per Second, More Is Better Palabos 2.3 Grid Size: 100 a b 150 300 450 600 750 702.00 695.18 1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
Rustls Benchmark: handshake-ticket - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 OpenBenchmarking.org handshakes/s, More Is Better Rustls 0.23.17 Benchmark: handshake-ticket - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 a b 400K 800K 1200K 1600K 2000K 1902840.03 1920689.19 1. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
Stockfish Chess Benchmark OpenBenchmarking.org Nodes Per Second, More Is Better Stockfish 17 Chess Benchmark a b 110M 220M 330M 440M 550M 454312151 495148838 1. (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -funroll-loops -msse -msse3 -mpopcnt -mavx2 -mbmi -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto-partition=one -flto=jobserver
Rustls Benchmark: handshake-resume - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 OpenBenchmarking.org handshakes/s, More Is Better Rustls 0.23.17 Benchmark: handshake-resume - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 a b 600K 1200K 1800K 2400K 3000K 2668332.01 2641193.17 1. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
Timed Node.js Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed Node.js Compilation 21.7.2 Time To Compile a b 30 60 90 120 150 113.93 114.51
CP2K Molecular Dynamics Input: Fayalite-FIST OpenBenchmarking.org Seconds, Fewer Is Better CP2K Molecular Dynamics 2024.3 Input: Fayalite-FIST a b 20 40 60 80 100 104.43 105.50 1. (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
NAMD Input: STMV with 1,066,628 Atoms OpenBenchmarking.org ns/day, More Is Better NAMD 3.0 Input: STMV with 1,066,628 Atoms a b 1.1579 2.3158 3.4737 4.6316 5.7895 5.14602 5.13310
Blender Blend File: Barbershop - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.3 Blend File: Barbershop - Compute: CPU-Only a b 20 40 60 80 100 87.52 87.67
OSPRay Studio Camera: 3 - Resolution: 1080p - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 3 - Resolution: 1080p - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU a b 1400 2800 4200 5600 7000 6582 6625
OSPRay Studio Camera: 1 - Resolution: 1080p - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 1 - Resolution: 1080p - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU a b 1200 2400 3600 4800 6000 5596 5624
OSPRay Studio Camera: 2 - Resolution: 1080p - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 2 - Resolution: 1080p - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU a b 1200 2400 3600 4800 6000 5674 5655
OSPRay Studio Camera: 2 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 2 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU a b 5K 10K 15K 20K 25K 22277 22166
OSPRay Studio Camera: 1 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 1 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU a b 5K 10K 15K 20K 25K 22116 22243
simdjson Throughput Test: DistinctUserID OpenBenchmarking.org GB/s, More Is Better simdjson 3.10 Throughput Test: DistinctUserID a b 2 4 6 8 10 6.66 7.02 1. (CXX) g++ options: -O3 -lrt
simdjson Throughput Test: TopTweet OpenBenchmarking.org GB/s, More Is Better simdjson 3.10 Throughput Test: TopTweet a b 2 4 6 8 10 7.18 7.27 1. (CXX) g++ options: -O3 -lrt
Renaissance Test: Apache Spark PageRank OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Apache Spark PageRank a b 700 1400 2100 2800 3500 3357.1 3203.0 MIN: 2934.39 / MAX: 3357.13 MIN: 2545.35 / MAX: 3387.63
simdjson Throughput Test: PartialTweets OpenBenchmarking.org GB/s, More Is Better simdjson 3.10 Throughput Test: PartialTweets a b 2 4 6 8 10 6.66 6.64 1. (CXX) g++ options: -O3 -lrt
Laghos Test: Sedov Blast Wave, ube_922_hex.mesh OpenBenchmarking.org Major Kernels Total Rate, More Is Better Laghos 3.1 Test: Sedov Blast Wave, ube_922_hex.mesh a b 90 180 270 360 450 418.98 419.08 1. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi
OSPRay Studio Camera: 3 - Resolution: 1080p - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 3 - Resolution: 1080p - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU a b 50 100 150 200 250 209 209
oneDNN Harness: Recurrent Neural Network Training - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: Recurrent Neural Network Training - Engine: CPU a b 150 300 450 600 750 705.83 705.30 MIN: 690.59 MIN: 690.81 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
oneDNN Harness: Recurrent Neural Network Inference - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: Recurrent Neural Network Inference - Engine: CPU a b 170 340 510 680 850 768.35 768.51 MIN: 760.24 MIN: 760.46 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
OSPRay Studio Camera: 2 - Resolution: 1080p - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 2 - Resolution: 1080p - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU a b 40 80 120 160 200 180 179
OSPRay Studio Camera: 1 - Resolution: 1080p - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 1 - Resolution: 1080p - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU a b 40 80 120 160 200 177 177
Renaissance Test: Apache Spark Bayes OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Apache Spark Bayes a b 60 120 180 240 300 278.6 281.9 MIN: 225.49 / MAX: 627.64 MIN: 231.77 / MAX: 518.95
Rustls Benchmark: handshake - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 OpenBenchmarking.org handshakes/s, More Is Better Rustls 0.23.17 Benchmark: handshake - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 a b 90K 180K 270K 360K 450K 396787.89 397683.95 1. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
OpenVINO Model: Noise Suppression Poconet-Like FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Noise Suppression Poconet-Like FP16 - Device: CPU a b 6 12 18 24 30 27.13 27.26 MIN: 11.84 / MAX: 65.19 MIN: 11.82 / MAX: 69.91 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Noise Suppression Poconet-Like FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Noise Suppression Poconet-Like FP16 - Device: CPU a b 1400 2800 4200 5600 7000 6603.69 6653.05 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
SVT-AV1 Encoder Mode: Preset 5 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 5 - Input: Bosphorus 4K a b 9 18 27 36 45 37.57 39.12 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Build2 Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Build2 0.17 Time To Compile a b 15 30 45 60 75 65.44 64.79
OpenVINO Model: Face Detection FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Face Detection FP16-INT8 - Device: CPU a b 60 120 180 240 300 265.07 265.16 MIN: 208.86 / MAX: 287.09 MIN: 208.64 / MAX: 284.93 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Face Detection FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Face Detection FP16-INT8 - Device: CPU a b 40 80 120 160 200 180.61 180.55 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OSPRay Studio Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU a b 200 400 600 800 1000 820 818
OSPRay Studio Camera: 2 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 2 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU a b 150 300 450 600 750 705 709
simdjson Throughput Test: Kostya OpenBenchmarking.org GB/s, More Is Better simdjson 3.10 Throughput Test: Kostya a b 0.9563 1.9126 2.8689 3.8252 4.7815 4.25 4.25 1. (CXX) g++ options: -O3 -lrt
OpenVINO Model: Machine Translation EN To DE FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Machine Translation EN To DE FP16 - Device: CPU a b 12 24 36 48 60 50.88 51.05 MIN: 39.09 / MAX: 115.59 MIN: 43.33 / MAX: 141.5 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Machine Translation EN To DE FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Machine Translation EN To DE FP16 - Device: CPU a b 200 400 600 800 1000 942.53 939.49 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
NAMD Input: ATPase with 327,506 Atoms OpenBenchmarking.org ns/day, More Is Better NAMD 3.0 Input: ATPase with 327,506 Atoms a b 4 8 12 16 20 17.56 17.42
OpenVINO Model: Person Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Person Detection FP16 - Device: CPU a b 14 28 42 56 70 60.91 60.60 MIN: 44.34 / MAX: 103.49 MIN: 47.97 / MAX: 103.7 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Person Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Person Detection FP16 - Device: CPU a b 200 400 600 800 1000 787.3 791.5 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OSPRay Studio Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU a b 150 300 450 600 750 691 695
OpenVINO Model: Person Vehicle Bike Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Person Vehicle Bike Detection FP16 - Device: CPU a b 1.143 2.286 3.429 4.572 5.715 5.08 5.08 MIN: 4.41 / MAX: 25.49 MIN: 4.39 / MAX: 21.92 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Person Vehicle Bike Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Person Vehicle Bike Detection FP16 - Device: CPU a b 2K 4K 6K 8K 10K 9423.76 9417.61 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Face Detection Retail FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Face Detection Retail FP16-INT8 - Device: CPU a b 2 4 6 8 10 6.27 6.27 MIN: 4.88 / MAX: 25.84 MIN: 4.86 / MAX: 27.67 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Face Detection Retail FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Face Detection Retail FP16-INT8 - Device: CPU a b 7K 14K 21K 28K 35K 30463.96 30463.69 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Person Re-Identification Retail FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Person Re-Identification Retail FP16 - Device: CPU a b 0.9338 1.8676 2.8014 3.7352 4.669 4.15 4.15 MIN: 3.54 / MAX: 18.1 MIN: 3.5 / MAX: 17.94 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Person Re-Identification Retail FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Person Re-Identification Retail FP16 - Device: CPU a b 2K 4K 6K 8K 10K 11541.90 11548.65 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Road Segmentation ADAS FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Road Segmentation ADAS FP16 - Device: CPU a b 4 8 12 16 20 17.20 17.16 MIN: 14.75 / MAX: 45.78 MIN: 14.74 / MAX: 41.52 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Road Segmentation ADAS FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Road Segmentation ADAS FP16 - Device: CPU a b 600 1200 1800 2400 3000 2786.59 2793.90 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Road Segmentation ADAS FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Road Segmentation ADAS FP16-INT8 - Device: CPU a b 4 8 12 16 20 15.06 15.02 MIN: 13.23 / MAX: 40.32 MIN: 13.11 / MAX: 39.52 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Road Segmentation ADAS FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Road Segmentation ADAS FP16-INT8 - Device: CPU a b 700 1400 2100 2800 3500 3183.67 3191.83 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Handwritten English Recognition FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Handwritten English Recognition FP16-INT8 - Device: CPU a b 9 18 27 36 45 40.36 40.22 MIN: 31.24 / MAX: 63.48 MIN: 35.66 / MAX: 61.94 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Handwritten English Recognition FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Handwritten English Recognition FP16-INT8 - Device: CPU a b 1000 2000 3000 4000 5000 4735.48 4750.95 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Handwritten English Recognition FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Handwritten English Recognition FP16 - Device: CPU a b 10 20 30 40 50 43.99 45.22 MIN: 33.18 / MAX: 76.74 MIN: 33.79 / MAX: 76.18 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Handwritten English Recognition FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Handwritten English Recognition FP16 - Device: CPU a b 900 1800 2700 3600 4500 4356.53 4239.38 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Weld Porosity Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16 - Device: CPU a b 5 10 15 20 25 20.67 20.69 MIN: 16.41 / MAX: 106.72 MIN: 16.5 / MAX: 74.92 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Weld Porosity Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16 - Device: CPU a b 2K 4K 6K 8K 10K 9176.33 9176.35 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU a b 0.081 0.162 0.243 0.324 0.405 0.36 0.36 MIN: 0.33 / MAX: 50.74 MIN: 0.32 / MAX: 54.77 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU a b 30K 60K 90K 120K 150K 161304.50 159292.51 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
LiteRT Model: NASNet Mobile OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: NASNet Mobile a b 300K 600K 900K 1200K 1500K 1186260 1167580
OpenVINO Model: Weld Porosity Detection FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16-INT8 - Device: CPU a b 3 6 9 12 15 10.59 10.63 MIN: 9.05 / MAX: 48.25 MIN: 8.7 / MAX: 43.51 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Weld Porosity Detection FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16-INT8 - Device: CPU a b 4K 8K 12K 16K 20K 17923.31 17827.22 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
GraphicsMagick Operation: Resizing OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.43 Operation: Resizing a b 11 22 33 44 55 48 47 1. (CC) gcc options: -fopenmp -O2 -ltiff -ljbig -lwebpmux -lwebp -ljpeg -lXext -lX11 -lzstd -llzma -lz -lm -lpthread -lgomp
OpenVINO Model: Face Detection Retail FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Face Detection Retail FP16 - Device: CPU a b 0.5153 1.0306 1.5459 2.0612 2.5765 2.29 2.29 MIN: 1.88 / MAX: 20.37 MIN: 1.92 / MAX: 19.38 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Face Detection Retail FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Face Detection Retail FP16 - Device: CPU a b 4K 8K 12K 16K 20K 20824.07 20830.76 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Vehicle Detection FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Vehicle Detection FP16-INT8 - Device: CPU a b 1.0553 2.1106 3.1659 4.2212 5.2765 4.69 4.69 MIN: 3.76 / MAX: 24.84 MIN: 3.87 / MAX: 26.1 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
OpenVINO Model: Vehicle Detection FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Vehicle Detection FP16-INT8 - Device: CPU a b 2K 4K 6K 8K 10K 10202.35 10196.60 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
Laghos Test: Triple Point Problem OpenBenchmarking.org Major Kernels Total Rate, More Is Better Laghos 3.1 Test: Triple Point Problem a b 50 100 150 200 250 217.80 210.00 1. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi
GraphicsMagick Operation: Noise-Gaussian OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.43 Operation: Noise-Gaussian a b 40 80 120 160 200 170 168 1. (CC) gcc options: -fopenmp -O2 -ltiff -ljbig -lwebpmux -lwebp -ljpeg -lXext -lX11 -lzstd -llzma -lz -lm -lpthread -lgomp
GraphicsMagick Operation: Rotate OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.43 Operation: Rotate a b 30 60 90 120 150 112 112 1. (CC) gcc options: -fopenmp -O2 -ltiff -ljbig -lwebpmux -lwebp -ljpeg -lXext -lX11 -lzstd -llzma -lz -lm -lpthread -lgomp
GraphicsMagick Operation: HWB Color Space OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.43 Operation: HWB Color Space a b 40 80 120 160 200 196 195 1. (CC) gcc options: -fopenmp -O2 -ltiff -ljbig -lwebpmux -lwebp -ljpeg -lXext -lX11 -lzstd -llzma -lz -lm -lpthread -lgomp
GraphicsMagick Operation: Enhanced OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.43 Operation: Enhanced a b 80 160 240 320 400 358 348 1. (CC) gcc options: -fopenmp -O2 -ltiff -ljbig -lwebpmux -lwebp -ljpeg -lXext -lX11 -lzstd -llzma -lz -lm -lpthread -lgomp
GraphicsMagick Operation: Sharpen OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.43 Operation: Sharpen a b 60 120 180 240 300 263 259 1. (CC) gcc options: -fopenmp -O2 -ltiff -ljbig -lwebpmux -lwebp -ljpeg -lXext -lX11 -lzstd -llzma -lz -lm -lpthread -lgomp
GraphicsMagick Operation: Swirl OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.43 Operation: Swirl a b 120 240 360 480 600 544 529 1. (CC) gcc options: -fopenmp -O2 -ltiff -ljbig -lwebpmux -lwebp -ljpeg -lXext -lX11 -lzstd -llzma -lz -lm -lpthread -lgomp
simdjson Throughput Test: LargeRandom OpenBenchmarking.org GB/s, More Is Better simdjson 3.10 Throughput Test: LargeRandom a b 0.2835 0.567 0.8505 1.134 1.4175 1.26 1.26 1. (CXX) g++ options: -O3 -lrt
LiteRT Model: Inception ResNet V2 OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Inception ResNet V2 a b 80K 160K 240K 320K 400K 353192 305612
WarpX Input: Uniform Plasma OpenBenchmarking.org Seconds, Fewer Is Better WarpX 24.10 Input: Uniform Plasma a b 9 18 27 36 45 41.45 41.60 1. (CXX) g++ options: -O3 -lm
LiteRT Model: Inception V4 OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Inception V4 a b 50K 100K 150K 200K 250K 241705 245655
LiteRT Model: DeepLab V3 OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: DeepLab V3 a b 20K 40K 60K 80K 100K 79532.8 100615.0
LiteRT Model: Quantized COCO SSD MobileNet v1 OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Quantized COCO SSD MobileNet v1 a b 20K 40K 60K 80K 100K 105403.0 25207.6
LiteRT Model: Mobilenet Quant OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Mobilenet Quant a b 20K 40K 60K 80K 100K 104382.0 26668.7
LiteRT Model: Mobilenet Float OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Mobilenet Float a b 5K 10K 15K 20K 25K 16096.3 24019.2
LiteRT Model: SqueezeNet OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: SqueezeNet a b 6K 12K 18K 24K 30K 25283.7 27388.1
7-Zip Compression Test: Decompression Rating OpenBenchmarking.org MIPS, More Is Better 7-Zip Compression 24.05 Test: Decompression Rating a b 200K 400K 600K 800K 1000K 1135910 1136332 1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC
7-Zip Compression Test: Compression Rating OpenBenchmarking.org MIPS, More Is Better 7-Zip Compression 24.05 Test: Compression Rating a b 160K 320K 480K 640K 800K 752410 744362 1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC
QuantLib Size: XXS OpenBenchmarking.org tasks/s, More Is Better QuantLib 1.35-dev Size: XXS a b 30 60 90 120 150 116.02 116.05 1. (CXX) g++ options: -O3 -march=native -fPIE -pie
OpenVINO GenAI Model: Gemma-7b-int4-ov - Device: CPU - Time Per Output Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Gemma-7b-int4-ov - Device: CPU - Time Per Output Token a b 8 16 24 32 40 34.08 34.15
OpenVINO GenAI Model: Gemma-7b-int4-ov - Device: CPU - Time To First Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Gemma-7b-int4-ov - Device: CPU - Time To First Token a b 11 22 33 44 55 49.79 49.11
OpenVINO GenAI Model: Gemma-7b-int4-ov - Device: CPU OpenBenchmarking.org tokens/s, More Is Better OpenVINO GenAI 2024.5 Model: Gemma-7b-int4-ov - Device: CPU a b 7 14 21 28 35 29.34 29.28
WebP Image Encode Encode Settings: Quality 100, Lossless, Highest Compression OpenBenchmarking.org MP/s, More Is Better WebP Image Encode 1.4 Encode Settings: Quality 100, Lossless, Highest Compression a b 0.1283 0.2566 0.3849 0.5132 0.6415 0.57 0.57 1. (CC) gcc options: -fvisibility=hidden -O2 -lm
OSPRay Studio Camera: 3 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU OpenBenchmarking.org ms, Fewer Is Better OSPRay Studio 1.0 Camera: 3 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU a b 7K 14K 21K 28K 35K 31282 31052
WarpX Input: Plasma Acceleration OpenBenchmarking.org Seconds, Fewer Is Better WarpX 24.10 Input: Plasma Acceleration a b 9 18 27 36 45 37.27 37.16 1. (CXX) g++ options: -O3 -lm
Timed PHP Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed PHP Compilation 8.3.4 Time To Compile a b 9 18 27 36 45 40.24 40.18
GROMACS Input: water_GMX50_bare OpenBenchmarking.org Ns Per Day, More Is Better GROMACS Input: water_GMX50_bare a b 3 6 9 12 15 12.38 12.32 1. GROMACS version: 2023.1-Ubuntu_2023.1_2ubuntu1
Y-Cruncher Pi Digits To Calculate: 5B OpenBenchmarking.org Seconds, Fewer Is Better Y-Cruncher 0.8.5 Pi Digits To Calculate: 5B a b 6 12 18 24 30 27.57 27.41
CP2K Molecular Dynamics Input: H20-64 OpenBenchmarking.org Seconds, Fewer Is Better CP2K Molecular Dynamics 2024.3 Input: H20-64 a b 6 12 18 24 30 23.74 23.25 1. (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
srsRAN Project Test: PDSCH Processor Benchmark, Throughput Total OpenBenchmarking.org Mbps, More Is Better srsRAN Project 24.10 Test: PDSCH Processor Benchmark, Throughput Total a b 30K 60K 90K 120K 150K 130691.7 129688.0 1. (CXX) g++ options: -O3 -march=native -mtune=generic -fno-trapping-math -fno-math-errno -ldl
C-Ray Resolution: 5K - Rays Per Pixel: 16 OpenBenchmarking.org Seconds, Fewer Is Better C-Ray 2.0 Resolution: 5K - Rays Per Pixel: 16 a b 7 14 21 28 35 30.59 30.85 1. (CC) gcc options: -lpthread -lm
Stress-NG Test: Context Switching OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.17.08 Test: Context Switching a b 8M 16M 24M 32M 40M 35685472.59 34977675.68 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Stress-NG Test: Socket Activity OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.17.08 Test: Socket Activity a b 6K 12K 18K 24K 30K 26852.29 26821.69 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Stress-NG Test: Radix String Sort OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.17.08 Test: Radix String Sort a b 500 1000 1500 2000 2500 2379.83 2381.41 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Stress-NG Test: Semaphores OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.17.08 Test: Semaphores a b 50M 100M 150M 200M 250M 248747680.85 251733262.16 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Stress-NG Test: CPU Stress OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.17.08 Test: CPU Stress a b 90K 180K 270K 360K 450K 405259.43 406176.40 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
Stress-NG Test: Crypto OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.17.08 Test: Crypto a b 60M 120M 180M 240M 300M 5791606.60 284241670.58 1. (CXX) g++ options: -O2 -std=gnu99 -lc -lm
OpenVINO GenAI Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time Per Output Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time Per Output Token a b 5 10 15 20 25 22.82 22.94
OpenVINO GenAI Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time To First Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time To First Token a b 8 16 24 32 40 35.40 35.35
OpenVINO GenAI Model: Falcon-7b-instruct-int4-ov - Device: CPU OpenBenchmarking.org tokens/s, More Is Better OpenVINO GenAI 2024.5 Model: Falcon-7b-instruct-int4-ov - Device: CPU a b 10 20 30 40 50 43.81 43.59
Timed Linux Kernel Compilation Build: defconfig OpenBenchmarking.org Seconds, Fewer Is Better Timed Linux Kernel Compilation 6.8 Build: defconfig a b 7 14 21 28 35 29.76 29.81
Blender Blend File: Pabellon Barcelona - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.3 Blend File: Pabellon Barcelona - Compute: CPU-Only a b 7 14 21 28 35 28.03 27.82
Timed Eigen Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed Eigen Compilation 3.4.0 Time To Compile a b 7 14 21 28 35 28.50 28.49
Blender Blend File: Classroom - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.3 Blend File: Classroom - Compute: CPU-Only a b 6 12 18 24 30 23.82 23.70
oneDNN Harness: Deconvolution Batch shapes_1d - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: Deconvolution Batch shapes_1d - Engine: CPU a b 8 16 24 32 40 31.73 32.47 MIN: 29.29 MIN: 30.03 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
SVT-AV1 Encoder Mode: Preset 8 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 8 - Input: Bosphorus 4K a b 30 60 90 120 150 124.49 116.55 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenVINO GenAI Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU - Time Per Output Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU - Time Per Output Token a b 4 8 12 16 20 17.61 17.79
OpenVINO GenAI Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU - Time To First Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU - Time To First Token a b 5 10 15 20 25 20.41 20.56
OpenVINO GenAI Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU OpenBenchmarking.org tokens/s, More Is Better OpenVINO GenAI 2024.5 Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU a b 13 26 39 52 65 56.80 56.21
Rustls Benchmark: handshake - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 OpenBenchmarking.org handshakes/s, More Is Better Rustls 0.23.17 Benchmark: handshake - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 a b 20K 40K 60K 80K 100K 81958.94 81993.85 1. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs
ASTC Encoder Preset: Very Thorough OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 5.0 Preset: Very Thorough a b 4 8 12 16 20 17.03 17.04 1. (CXX) g++ options: -O3 -flto -pthread
Intel Open Image Denoise Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only OpenBenchmarking.org Images / Sec, More Is Better Intel Open Image Denoise 2.3 Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only a b 0.3758 0.7516 1.1274 1.5032 1.879 1.67 1.66
ASTC Encoder Preset: Exhaustive OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 5.0 Preset: Exhaustive a b 3 6 9 12 15 10.48 10.48 1. (CXX) g++ options: -O3 -flto -pthread
uvg266 Video Input: Bosphorus 4K - Video Preset: Slow OpenBenchmarking.org Frames Per Second, More Is Better uvg266 0.8.0 Video Input: Bosphorus 4K - Video Preset: Slow a b 8 16 24 32 40 32.13 33.45
Timed FFmpeg Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed FFmpeg Compilation 7.0 Time To Compile a b 5 10 15 20 25 18.42 18.51
C-Ray Resolution: 4K - Rays Per Pixel: 16 OpenBenchmarking.org Seconds, Fewer Is Better C-Ray 2.0 Resolution: 4K - Rays Per Pixel: 16 a b 4 8 12 16 20 17.62 17.50 1. (CC) gcc options: -lpthread -lm
uvg266 Video Input: Bosphorus 4K - Video Preset: Medium OpenBenchmarking.org Frames Per Second, More Is Better uvg266 0.8.0 Video Input: Bosphorus 4K - Video Preset: Medium a b 8 16 24 32 40 34.03 33.64
x265 Video Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better x265 4.1 Video Input: Bosphorus 4K a b 8 16 24 32 40 35.88 36.14 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
WebP Image Encode Encode Settings: Quality 100, Lossless OpenBenchmarking.org MP/s, More Is Better WebP Image Encode 1.4 Encode Settings: Quality 100, Lossless a b 0.3285 0.657 0.9855 1.314 1.6425 1.45 1.46 1. (CC) gcc options: -fvisibility=hidden -O2 -lm
OpenVINO GenAI Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time Per Output Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time Per Output Token a b 5 10 15 20 25 21.14 21.56
OpenVINO GenAI Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time To First Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time To First Token a b 7 14 21 28 35 29.16 29.70
OpenVINO GenAI Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU OpenBenchmarking.org tokens/s, More Is Better OpenVINO GenAI 2024.5 Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU a b 11 22 33 44 55 47.29 46.38
SVT-AV1 Encoder Mode: Preset 13 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.3 Encoder Mode: Preset 13 - Input: Bosphorus 4K a b 40 80 120 160 200 182.65 173.65 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
oneDNN Harness: IP Shapes 1D - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: IP Shapes 1D - Engine: CPU a b 0.2317 0.4634 0.6951 0.9268 1.1585 1.02958 1.02654 MIN: 0.93 MIN: 0.95 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
Blender Blend File: Junkshop - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.3 Blend File: Junkshop - Compute: CPU-Only a b 3 6 9 12 15 13.42 13.38
Y-Cruncher Pi Digits To Calculate: 1B OpenBenchmarking.org Seconds, Fewer Is Better Y-Cruncher 0.8.5 Pi Digits To Calculate: 1B a b 2 4 6 8 10 8.271 8.280
ACES DGEMM Sustained Floating-Point Rate OpenBenchmarking.org GFLOP/s, More Is Better ACES DGEMM 1.0 Sustained Floating-Point Rate a b 800 1600 2400 3200 4000 3735.55 3742.80 1. (CC) gcc options: -ffast-math -mavx2 -O3 -fopenmp -lopenblas
Blender Blend File: Fishy Cat - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.3 Blend File: Fishy Cat - Compute: CPU-Only a b 3 6 9 12 15 12.67 12.59
ASTC Encoder Preset: Fast OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 5.0 Preset: Fast a b 120 240 360 480 600 576.81 574.78 1. (CXX) g++ options: -O3 -flto -pthread
Primesieve Length: 1e13 OpenBenchmarking.org Seconds, Fewer Is Better Primesieve 12.6 Length: 1e13 a b 3 6 9 12 15 12.65 12.64 1. (CXX) g++ options: -O3
uvg266 Video Input: Bosphorus 4K - Video Preset: Ultra Fast OpenBenchmarking.org Frames Per Second, More Is Better uvg266 0.8.0 Video Input: Bosphorus 4K - Video Preset: Ultra Fast a b 11 22 33 44 55 49.86 47.42
uvg266 Video Input: Bosphorus 4K - Video Preset: Super Fast OpenBenchmarking.org Frames Per Second, More Is Better uvg266 0.8.0 Video Input: Bosphorus 4K - Video Preset: Super Fast a b 11 22 33 44 55 49.20 49.88
uvg266 Video Input: Bosphorus 4K - Video Preset: Very Fast OpenBenchmarking.org Frames Per Second, More Is Better uvg266 0.8.0 Video Input: Bosphorus 4K - Video Preset: Very Fast a b 12 24 36 48 60 51.21 50.34
Blender Blend File: BMW27 - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.3 Blend File: BMW27 - Compute: CPU-Only a b 3 6 9 12 15 9.37 9.47
Intel Open Image Denoise Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only OpenBenchmarking.org Images / Sec, More Is Better Intel Open Image Denoise 2.3 Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only a b 0.7695 1.539 2.3085 3.078 3.8475 3.40 3.42
Intel Open Image Denoise Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only OpenBenchmarking.org Images / Sec, More Is Better Intel Open Image Denoise 2.3 Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only a b 0.7718 1.5436 2.3154 3.0872 3.859 3.41 3.43
oneDNN Harness: IP Shapes 3D - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: IP Shapes 3D - Engine: CPU a b 0.1529 0.3058 0.4587 0.6116 0.7645 0.677961 0.679610 MIN: 0.64 MIN: 0.65 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
Y-Cruncher Pi Digits To Calculate: 500M OpenBenchmarking.org Seconds, Fewer Is Better Y-Cruncher 0.8.5 Pi Digits To Calculate: 500M a b 0.9927 1.9854 2.9781 3.9708 4.9635 4.412 4.382
ASTC Encoder Preset: Thorough OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 5.0 Preset: Thorough a b 30 60 90 120 150 114.27 113.61 1. (CXX) g++ options: -O3 -flto -pthread
WebP Image Encode Encode Settings: Quality 100, Highest Compression OpenBenchmarking.org MP/s, More Is Better WebP Image Encode 1.4 Encode Settings: Quality 100, Highest Compression a b 0.8123 1.6246 2.4369 3.2492 4.0615 3.61 3.60 1. (CC) gcc options: -fvisibility=hidden -O2 -lm
x265 Video Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better x265 4.1 Video Input: Bosphorus 1080p a b 20 40 60 80 100 98.66 97.80 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
oneDNN Harness: Convolution Batch Shapes Auto - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: Convolution Batch Shapes Auto - Engine: CPU a b 0.1299 0.2598 0.3897 0.5196 0.6495 0.502239 0.577142 MIN: 0.43 MIN: 0.49 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
ASTC Encoder Preset: Medium OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 5.0 Preset: Medium a b 110 220 330 440 550 484.69 489.44 1. (CXX) g++ options: -O3 -flto -pthread
C-Ray Resolution: 1080p - Rays Per Pixel: 16 OpenBenchmarking.org Seconds, Fewer Is Better C-Ray 2.0 Resolution: 1080p - Rays Per Pixel: 16 a b 1.1088 2.2176 3.3264 4.4352 5.544 4.928 4.603 1. (CC) gcc options: -lpthread -lm
oneDNN Harness: Deconvolution Batch shapes_3d - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: Deconvolution Batch shapes_3d - Engine: CPU a b 0.2239 0.4478 0.6717 0.8956 1.1195 0.983651 0.994893 MIN: 0.91 MIN: 0.89 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
WebP Image Encode Encode Settings: Quality 100 OpenBenchmarking.org MP/s, More Is Better WebP Image Encode 1.4 Encode Settings: Quality 100 a b 3 6 9 12 15 11.26 11.39 1. (CC) gcc options: -fvisibility=hidden -O2 -lm
WebP Image Encode Encode Settings: Default OpenBenchmarking.org MP/s, More Is Better WebP Image Encode 1.4 Encode Settings: Default a b 5 10 15 20 25 18.62 18.58 1. (CC) gcc options: -fvisibility=hidden -O2 -lm
Primesieve Length: 1e12 OpenBenchmarking.org Seconds, Fewer Is Better Primesieve 12.6 Length: 1e12 a b 0.2378 0.4756 0.7134 0.9512 1.189 1.053 1.057 1. (CXX) g++ options: -O3
Phoronix Test Suite v10.8.5