eoy2024

AMD EPYC 4564P 16-Core testing with a Supermicro AS-3015A-I H13SAE-MF v1.00 (2.1 BIOS) and ASPEED on Ubuntu 24.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2412061-NE-EOY20243073&grs&sro.

eoy2024ProcessorMotherboardChipsetMemoryDiskGraphicsAudioMonitorNetworkOSKernelDesktopDisplay ServerCompilerFile-SystemScreen ResolutionabcAMD EPYC 4564P 16-Core @ 5.88GHz (16 Cores / 32 Threads)Supermicro AS-3015A-I H13SAE-MF v1.00 (2.1 BIOS)AMD Device 14d82 x 32GB DRAM-4800MT/s Micron MTC20C2085S1EC48BA1 BC3201GB Micron_7450_MTFDKCC3T2TFS + 960GB SAMSUNG MZ1L2960HCJR-00A07ASPEEDAMD Rembrandt Radeon HD AudioVA24312 x Intel I210Ubuntu 24.046.8.0-11-generic (x86_64)GNOME Shell 45.3X Server 1.21.1.11GCC 13.2.0ext41024x768OpenBenchmarking.orgKernel Details- Transparent Huge Pages: madviseCompiler Details- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-fxIygj/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-fxIygj/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details- Scaling Governor: amd-pstate-epp performance (EPP: performance) - CPU Microcode: 0xa601209Java Details- OpenJDK Runtime Environment (build 21.0.2+13-Ubuntu-2)Python Details- Python 3.12.3Security Details- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

eoy2024litert: Quantized COCO SSD MobileNet v1litert: NASNet Mobilelitert: DeepLab V3litert: Mobilenet Quantcp2k: Fayalite-FISTllama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512stockfish: Chess Benchmarkrelion: Basic - CPUlitert: Inception V4llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024renaissance: Apache Spark Bayesllama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024litert: Mobilenet Floatrenaissance: In-Memory Database Shootoutrenaissance: Scala Dottycp2k: H20-256renaissance: Rand Forestllama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048rustls: handshake - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384simdjson: LargeRandgcrypt: xnnpack: FP16MobileNetV2onednn: Deconvolution Batch shapes_1d - CPUlitert: Inception ResNet V2xnnpack: FP32MobileNetV2simdjson: Kostyapyperformance: asyncio_tcp_sslllamafile: Llama-3.2-3B-Instruct.Q6_K - Text Generation 16xnnpack: FP32MobileNetV3Largelitert: SqueezeNetllama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048renaissance: Genetic Algorithm Using Jenetics + Futuresxnnpack: FP16MobileNetV3Largesimdjson: TopTweetllama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128xnnpack: FP32MobileNetV1renaissance: Gaussian Mixture Modelxnnpack: FP16MobileNetV1xnnpack: FP32MobileNetV3Smallrenaissance: Savina Reactors.IOrenaissance: Akka Unbalanced Cobwebbed Treesvt-av1: Preset 8 - Bosphorus 1080psvt-av1: Preset 8 - Bosphorus 4Konednn: IP Shapes 3D - CPUrenaissance: Finagle HTTP Requestsonednn: IP Shapes 1D - CPUllama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048cp2k: H20-64onednn: Convolution Batch Shapes Auto - CPUx265: Bosphorus 4Ksvt-av1: Preset 13 - Bosphorus 1080psvt-av1: Preset 5 - Beauty 4K 10-bitbuild-eigen: Time To Compileonednn: Deconvolution Batch shapes_3d - CPUrustls: handshake-resume - TLS13_CHACHA20_POLY1305_SHA256whisper-cpp: ggml-small.en - 2016 State of the Unionrustls: handshake-ticket - TLS13_CHACHA20_POLY1305_SHA256onnx: bertsquad-12 - CPU - Standardllamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Text Generation 128rustls: handshake - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256rustls: handshake-resume - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256stockfish: Chess Benchmarkpovray: Trace Timellamafile: mistral-7b-instruct-v0.2.Q5_K_M - Text Generation 128llamafile: Llama-3.2-3B-Instruct.Q6_K - Text Generation 128renaissance: ALS Movie Lenssvt-av1: Preset 13 - Bosphorus 4Konednn: Recurrent Neural Network Inference - CPUonnx: ZFNet-512 - CPU - Standardx265: Bosphorus 1080ponnx: fcn-resnet101-11 - CPU - Standardsimdjson: PartialTweetswhisperfile: Smallonnx: Faster R-CNN R-50-FPN-int8 - CPU - Standardnamd: ATPase with 327,506 Atomsonnx: ResNet101_DUC_HDC-12 - CPU - Standardcouchdb: 100 - 3000 - 30numpy: pyperformance: chaosmt-dgemm: Sustained Floating-Point Ratesvt-av1: Preset 5 - Bosphorus 1080ppyperformance: floatxnnpack: FP16MobileNetV3Smallrustls: handshake-ticket - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256xnnpack: QS8MobileNetV2whisperfile: Tinyonnx: ArcFace ResNet-100 - CPU - Standardrenaissance: Apache Spark PageRanksvt-av1: Preset 8 - Beauty 4K 10-bitpyperformance: raytracerustls: handshake-ticket - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384financebench: Bonds OpenMPonnx: CaffeNet 12-int8 - CPU - Standardcouchdb: 500 - 3000 - 30onnx: ResNet50 v1-12-int8 - CPU - Standardpyperformance: gosvt-av1: Preset 3 - Bosphorus 4Kcouchdb: 300 - 1000 - 30llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512blender: Junkshop - CPU-Onlysvt-av1: Preset 5 - Bosphorus 4Kospray: gravity_spheres_volume/dim_512/ao/real_timeonnx: yolov4 - CPU - Standardcompress-7zip: Decompression Ratingonednn: Recurrent Neural Network Training - CPUopenvino-genai: Gemma-7b-int4-ov - CPUsvt-av1: Preset 13 - Beauty 4K 10-bitospray: gravity_spheres_volume/dim_512/scivis/real_timesvt-av1: Preset 3 - Beauty 4K 10-bitgromacs: water_GMX50_baresimdjson: DistinctUserIDblender: Classroom - CPU-Onlypyperformance: pathlibllama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128astcenc: Exhaustiveastcenc: Thoroughetcpak: Multi-Threaded - ETC2pyperformance: nbodysvt-av1: Preset 3 - Bosphorus 1080pllama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024couchdb: 500 - 1000 - 30astcenc: Mediumblender: Barbershop - CPU-Onlypyperformance: pickle_pure_pythonastcenc: Very Thoroughpyperformance: gc_collectospray: particle_volume/scivis/real_timeprimesieve: 1e13pyperformance: regex_compilellamafile: wizardcoder-python-34b-v1.0.Q6_K - Text Generation 16ospray: particle_volume/pathtracer/real_timepyperformance: async_tree_iollamafile: wizardcoder-python-34b-v1.0.Q6_K - Text Generation 128blender: Fishy Cat - CPU-Onlyprimesieve: 1e12rustls: handshake - TLS13_CHACHA20_POLY1305_SHA256financebench: Repo OpenMPpyperformance: django_templateospray: particle_volume/ao/real_timebyte: Dhrystone 2quantlib: XXSy-cruncher: 1Bllama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPUllamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Text Generation 16whisperfile: Mediumbyte: Pipellamafile: mistral-7b-instruct-v0.2.Q5_K_M - Text Generation 16blender: BMW27 - CPU-Onlyopenssl: AES-128-GCMopenssl: AES-256-GCMpyperformance: python_startupospray: gravity_spheres_volume/dim_512/pathtracer/real_timewhisper-cpp: ggml-medium.en - 2016 State of the Unionpyperformance: asyncio_websocketsopenvino-genai: Falcon-7b-instruct-int4-ov - CPUquantlib: Spyperformance: xml_etreecompress-7zip: Compression Ratingcouchdb: 100 - 1000 - 30build2: Time To Compilebyte: System Cally-cruncher: 500Mwhisper-cpp: ggml-base.en - 2016 State of the Uniononnx: T5 Encoder - CPU - Standardpyperformance: crypto_pyaesnamd: STMV with 1,066,628 Atomscouchdb: 300 - 3000 - 30onnx: GPT-2 - CPU - Standardopenssl: ChaCha20-Poly1305openssl: ChaCha20byte: Whetstone Doubleonnx: super-resolution-10 - CPU - Standardblender: Pabellon Barcelona - CPU-Onlyastcenc: Fastrustls: handshake-resume - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384cassandra: Writesllamafile: wizardcoder-python-34b-v1.0.Q6_K - Prompt Processing 2048llamafile: wizardcoder-python-34b-v1.0.Q6_K - Prompt Processing 1024llamafile: wizardcoder-python-34b-v1.0.Q6_K - Prompt Processing 512llamafile: wizardcoder-python-34b-v1.0.Q6_K - Prompt Processing 256llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Prompt Processing 2048llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Prompt Processing 1024llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Prompt Processing 512llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Prompt Processing 256llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 2048llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 1024llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 512llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 256llamafile: Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 2048llamafile: Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 1024llamafile: Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 512llamafile: Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 256pyperformance: json_loadsopenvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU - Time Per Output Tokenopenvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU - Time To First Tokenopenvino-genai: Falcon-7b-instruct-int4-ov - CPU - Time Per Output Tokenopenvino-genai: Falcon-7b-instruct-int4-ov - CPU - Time To First Tokenopenvino-genai: Gemma-7b-int4-ov - CPU - Time Per Output Tokenopenvino-genai: Gemma-7b-int4-ov - CPU - Time To First Tokenonnx: Faster R-CNN R-50-FPN-int8 - CPU - Standardonnx: ResNet101_DUC_HDC-12 - CPU - Standardonnx: super-resolution-10 - CPU - Standardonnx: ResNet50 v1-12-int8 - CPU - Standardonnx: ArcFace ResNet-100 - CPU - Standardonnx: fcn-resnet101-11 - CPU - Standardonnx: CaffeNet 12-int8 - CPU - Standardonnx: bertsquad-12 - CPU - Standardonnx: T5 Encoder - CPU - Standardonnx: ZFNet-512 - CPU - Standardonnx: yolov4 - CPU - Standardonnx: GPT-2 - CPU - Standardrenaissance: Apache Spark ALSabc2129.52169363579.67823.1794.03268.454752796944.2721477.8355.09490.070.7670.851211.483256.1477.0592.857414.463.09423535.681.83162.12511902.9761219530.214955.9764519.0318101794.1162.97732.8149810.4647.7212523399.511439793506.44403.8339.023102.0054.0582319.41.12573279.0458.1916.6728732.57842.5586.50458.6552.41294388077.69245.07838404263.4515.589926.2880462.63563852.574650703818.54210.4720.139805.7212.52700.859102.331114.453.21679.76195.4164247.06912.796321.54196232.188775.7538.21141.194104101.97150.7920262033284441.7093542.45372412.212.4681751553632.1433061.21875636.318511.775390.59777.89.59106.13327.373.5634.5387.6394411.05521659161372.039.8318.5887.587891.4221.69210.46143.3614.27.241.684420.3025577.8175929.57369.26148.049156.2217506.21652.7416778.9848678.49869.81.78236.2457551.9971.356.34776454.4521418.44531220.79.009171866536062.713.43218.4856.8819.2824.59534.91948806257.110.2253.55104784522170971727517005.778.82093700.9131512.9312.747635.816385969.92992.05349140426.68.77287.48973156.45341.70.75656367.83134.59692393529340130588495050343491.9141.117166.12396.64951820810.212713331228861443072153632768163848192409632768163848192409632768163848192409612.151.8655.9377.3486.06101.72106.6221.2429648.5227.086012.5589823.553310.8751.5708464.1416.391129.7698590.45237.427762958.4821468.74287.06933.176102.41861.6259130265867.31523265.4328.47529.575.96661295.513081.5447.0629.557398.159.84402625.061.81154.5312473.112620375.715595.9367218.3218771860.3565.27744.3154910.846.2812903494.8117410053594.34439.9330.8799.5544.156822264.71.15274285.7157.3476.8175432.04824.8086.37159.8732.46279380493.86240.59909397022.415.318225.8379085.83504511.314575174718.84610.6419.819958.3209.773711.433103.862112.853.17059.82192.6780847.72932.790251.52109235.345765.3538.71137.394602100.89350.19312589637.9285442.2004941.96392439.912.6111771536355.933432.636719629.655517.149386.577779.554107.182324.2174.2634.2257.5740810.96031668431383.649.9118.4427.528751.4151.67910.38144.4114.37.191.672820.1644575.0259.429.37568.8149.028155.2665509.31662.72486818.9324578.94970.21.79235.325759271.76.37876083.7321522.06640620.88.966321857795366.113.429218.4036.8519.224.69532.8074448718087.110.2653.75104404347840968217370605.798.79096703.2218831612.9712.709835.716405070.11492.29249062324.18.79487.27256156.83341.80.75634368.664134.31192216350580130359884190343113141.003166.25396.42611821261.882713731228861443072153632768163848192409632768163848192409632768163848192409612.152.156.2677.1384.39100.94107.0320.9488657.427.091722.5858923.8276315.4051.5874665.27876.375669.625991.23597.4434105.22153623108939.897500.33046.8458.5624.571420.81.745.73719.110.793472.43567.84331.7338.653100.8932296.658.64732.73838.1686.3749907.4212.945114.529.682.829251127.270287102.1092439.212.5979.49534.4487.6428216732118.5567.557911.41110.43573.91429.4658.97005234.9718.985861862548305.413.49148613927.98.8119912.724216431349016743.60.75813343187OpenBenchmarking.org

LiteRT

Model: Quantized COCO SSD MobileNet v1

OpenBenchmarking.orgMicroseconds, Fewer Is BetterLiteRT 2024-10-15Model: Quantized COCO SSD MobileNet v1ab60012001800240030002129.522958.48

LiteRT

Model: NASNet Mobile

OpenBenchmarking.orgMicroseconds, Fewer Is BetterLiteRT 2024-10-15Model: NASNet Mobileab5K10K15K20K25K16936.021468.7

LiteRT

Model: DeepLab V3

OpenBenchmarking.orgMicroseconds, Fewer Is BetterLiteRT 2024-10-15Model: DeepLab V3ab90018002700360045003579.674287.06

LiteRT

Model: Mobilenet Quant

OpenBenchmarking.orgMicroseconds, Fewer Is BetterLiteRT 2024-10-15Model: Mobilenet Quantab2004006008001000823.17933.18

CP2K Molecular Dynamics

Input: Fayalite-FIST

OpenBenchmarking.orgSeconds, Fewer Is BetterCP2K Molecular Dynamics 2024.3Input: Fayalite-FISTabc2040608010094.03102.42105.221. (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512ab153045607568.4061.621. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

Stockfish

Chess Benchmark

OpenBenchmarking.orgNodes Per Second, More Is BetterStockfish 17Chess Benchmarkabc13M26M39M52M65M5475279659130265536231081. (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -funroll-loops -msse -msse3 -mpopcnt -mavx2 -mbmi -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto-partition=one -flto=jobserver

RELION

Test: Basic - Device: CPU

OpenBenchmarking.orgSeconds, Fewer Is BetterRELION 5.0Test: Basic - Device: CPUabc2004006008001000944.27867.32939.901. (CXX) g++ options: -fPIC -std=c++14 -fopenmp -O3 -rdynamic -lfftw3f -lfftw3 -ldl -ltiff -lpng -ljpeg -lmpi_cxx -lmpi

LiteRT

Model: Inception V4

OpenBenchmarking.orgMicroseconds, Fewer Is BetterLiteRT 2024-10-15Model: Inception V4ab5K10K15K20K25K21477.823265.4

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024ab80160240320400355.09328.471. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

Renaissance

Test: Apache Spark Bayes

OpenBenchmarking.orgms, Fewer Is BetterRenaissance 0.16Test: Apache Spark Bayesabc110220330440550490.0529.5500.3MIN: 459.29 / MAX: 580.9MIN: 458.39 / MAX: 562.09MIN: 460.66 / MAX: 542.36

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512ab2040608010070.7675.961. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024ab163248648070.8566.001. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

LiteRT

Model: Mobilenet Float

OpenBenchmarking.orgMicroseconds, Fewer Is BetterLiteRT 2024-10-15Model: Mobilenet Floatab300600900120015001211.481295.51

Renaissance

Test: In-Memory Database Shootout

OpenBenchmarking.orgms, Fewer Is BetterRenaissance 0.16Test: In-Memory Database Shootoutabc70014002100280035003256.13081.53046.8MIN: 3019.89 / MAX: 3599.5MIN: 2836.52 / MAX: 3397.02MIN: 2814.66 / MAX: 3304.16

Renaissance

Test: Scala Dotty

OpenBenchmarking.orgms, Fewer Is BetterRenaissance 0.16Test: Scala Dottyabc100200300400500477.0447.0458.5MIN: 371.54 / MAX: 736.5MIN: 402.95 / MAX: 718.21MIN: 406.93 / MAX: 746.39

CP2K Molecular Dynamics

Input: H20-256

OpenBenchmarking.orgSeconds, Fewer Is BetterCP2K Molecular Dynamics 2024.3Input: H20-256abc140280420560700592.86629.56624.571. (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm

Renaissance

Test: Random Forest

OpenBenchmarking.orgms, Fewer Is BetterRenaissance 0.16Test: Random Forestabc90180270360450414.4398.1420.8MIN: 322.79 / MAX: 466.1MIN: 343.09 / MAX: 475.62MIN: 316.29 / MAX: 556.39

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048ab142842567063.0959.841. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

Rustls

Benchmark: handshake - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384

OpenBenchmarking.orghandshakes/s, More Is BetterRustls 0.23.17Benchmark: handshake - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384ab90K180K270K360K450K423535.68402625.061. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs

simdjson

Throughput Test: LargeRandom

OpenBenchmarking.orgGB/s, More Is Bettersimdjson 3.10Throughput Test: LargeRandomabc0.41180.82361.23541.64722.0591.831.811.741. (CXX) g++ options: -O3 -lrt

Gcrypt Library

OpenBenchmarking.orgSeconds, Fewer Is BetterGcrypt Library 1.10.3ab4080120160200162.13154.531. (CC) gcc options: -O2 -fvisibility=hidden

XNNPACK

Model: FP16MobileNetV2

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: FP16MobileNetV2ab30060090012001500119012471. (CXX) g++ options: -O3 -lrt -lm

oneDNN

Harness: Deconvolution Batch shapes_1d - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.6Harness: Deconvolution Batch shapes_1d - Engine: CPUab0.70031.40062.10092.80123.50152.976123.11260MIN: 2.42MIN: 2.41. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

LiteRT

Model: Inception ResNet V2

OpenBenchmarking.orgMicroseconds, Fewer Is BetterLiteRT 2024-10-15Model: Inception ResNet V2ab4K8K12K16K20K19530.220375.7

XNNPACK

Model: FP32MobileNetV2

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: FP32MobileNetV2ab30060090012001500149515591. (CXX) g++ options: -O3 -lrt -lm

simdjson

Throughput Test: Kostya

OpenBenchmarking.orgGB/s, More Is Bettersimdjson 3.10Throughput Test: Kostyaabc1.34332.68664.02995.37326.71655.975.935.731. (CXX) g++ options: -O3 -lrt

PyPerformance

Benchmark: asyncio_tcp_ssl

OpenBenchmarking.orgMilliseconds, Fewer Is BetterPyPerformance 1.11Benchmark: asyncio_tcp_sslab150300450600750645672

Llamafile

Model: Llama-3.2-3B-Instruct.Q6_K - Test: Text Generation 16

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: Llama-3.2-3B-Instruct.Q6_K - Test: Text Generation 16ab51015202519.0318.32

XNNPACK

Model: FP32MobileNetV3Large

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: FP32MobileNetV3Largeab400800120016002000181018771. (CXX) g++ options: -O3 -lrt -lm

LiteRT

Model: SqueezeNet

OpenBenchmarking.orgMicroseconds, Fewer Is BetterLiteRT 2024-10-15Model: SqueezeNetab4008001200160020001794.111860.35

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048ab153045607562.9765.271. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

Renaissance

Test: Genetic Algorithm Using Jenetics + Futures

OpenBenchmarking.orgms, Fewer Is BetterRenaissance 0.16Test: Genetic Algorithm Using Jenetics + Futuresabc160320480640800732.8744.3719.1MIN: 713.67 / MAX: 813.49MIN: 714.12 / MAX: 802.66MIN: 670.9 / MAX: 764.9

XNNPACK

Model: FP16MobileNetV3Large

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: FP16MobileNetV3Largeab30060090012001500149815491. (CXX) g++ options: -O3 -lrt -lm

simdjson

Throughput Test: TopTweet

OpenBenchmarking.orgGB/s, More Is Bettersimdjson 3.10Throughput Test: TopTweetabc369121510.4610.8010.791. (CXX) g++ options: -O3 -lrt

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128ab112233445547.7246.281. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

XNNPACK

Model: FP32MobileNetV1

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: FP32MobileNetV1ab30060090012001500125212901. (CXX) g++ options: -O3 -lrt -lm

Renaissance

Test: Gaussian Mixture Model

OpenBenchmarking.orgms, Fewer Is BetterRenaissance 0.16Test: Gaussian Mixture Modelabc70014002100280035003399.53494.83472.4MIN: 2471.52MIN: 2520.23MIN: 2469.6

XNNPACK

Model: FP16MobileNetV1

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: FP16MobileNetV1ab30060090012001500114311741. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP32MobileNetV3Small

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: FP32MobileNetV3Smallab200400600800100097910051. (CXX) g++ options: -O3 -lrt -lm

Renaissance

Test: Savina Reactors.IO

OpenBenchmarking.orgms, Fewer Is BetterRenaissance 0.16Test: Savina Reactors.IOabc80016002400320040003506.43594.33567.8MIN: 3506.38 / MAX: 4329.37MIN: 3594.26 / MAX: 4599.09MAX: 5162.74

Renaissance

Test: Akka Unbalanced Cobwebbed Tree

OpenBenchmarking.orgms, Fewer Is BetterRenaissance 0.16Test: Akka Unbalanced Cobwebbed Treeabc100020003000400050004403.84439.94331.7MAX: 5719.11MAX: 5696.46MIN: 4331.69 / MAX: 5601.8

SVT-AV1

Encoder Mode: Preset 8 - Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.3Encoder Mode: Preset 8 - Input: Bosphorus 1080pabc70140210280350339.02330.87338.651. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 8 - Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.3Encoder Mode: Preset 8 - Input: Bosphorus 4Kabc20406080100102.0199.55100.891. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

oneDNN

Harness: IP Shapes 3D - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.6Harness: IP Shapes 3D - Engine: CPUab0.93531.87062.80593.74124.67654.058004.15682MIN: 3.75MIN: 3.751. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

Renaissance

Test: Finagle HTTP Requests

OpenBenchmarking.orgms, Fewer Is BetterRenaissance 0.16Test: Finagle HTTP Requestsabc50010001500200025002319.42264.72296.6MIN: 1832.84MIN: 1788.41 / MAX: 2264.71MIN: 1805.17

oneDNN

Harness: IP Shapes 1D - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.6Harness: IP Shapes 1D - Engine: CPUab0.25940.51880.77821.03761.2971.125731.15274MIN: 1.03MIN: 1.031. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048ab60120180240300279.04285.711. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

CP2K Molecular Dynamics

Input: H20-64

OpenBenchmarking.orgSeconds, Fewer Is BetterCP2K Molecular Dynamics 2024.3Input: H20-64abc132639526558.1957.3558.651. (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm

oneDNN

Harness: Convolution Batch Shapes Auto - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.6Harness: Convolution Batch Shapes Auto - Engine: CPUab2468106.672876.81754MIN: 6.2MIN: 6.21. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

x265

Video Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is Betterx265Video Input: Bosphorus 4Kabc81624324032.5732.0432.731. x265 [info]: HEVC encoder version 3.5+1-f0c1022b6

SVT-AV1

Encoder Mode: Preset 13 - Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.3Encoder Mode: Preset 13 - Input: Bosphorus 1080pabc2004006008001000842.56824.81838.171. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 5 - Input: Beauty 4K 10-bit

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.3Encoder Mode: Preset 5 - Input: Beauty 4K 10-bitabc2468106.5046.3716.3741. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

Timed Eigen Compilation

Time To Compile

OpenBenchmarking.orgSeconds, Fewer Is BetterTimed Eigen Compilation 3.4.0Time To Compileab132639526558.6659.87

oneDNN

Harness: Deconvolution Batch shapes_3d - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.6Harness: Deconvolution Batch shapes_3d - Engine: CPUab0.55411.10821.66232.21642.77052.412942.46279MIN: 2.34MIN: 2.351. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

Rustls

Benchmark: handshake-resume - Suite: TLS13_CHACHA20_POLY1305_SHA256

OpenBenchmarking.orghandshakes/s, More Is BetterRustls 0.23.17Benchmark: handshake-resume - Suite: TLS13_CHACHA20_POLY1305_SHA256ab80K160K240K320K400K388077.69380493.861. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs

Whisper.cpp

Model: ggml-small.en - Input: 2016 State of the Union

OpenBenchmarking.orgSeconds, Fewer Is BetterWhisper.cpp 1.6.2Model: ggml-small.en - Input: 2016 State of the Unionab50100150200250245.08240.601. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vnni

Rustls

Benchmark: handshake-ticket - Suite: TLS13_CHACHA20_POLY1305_SHA256

OpenBenchmarking.orghandshakes/s, More Is BetterRustls 0.23.17Benchmark: handshake-ticket - Suite: TLS13_CHACHA20_POLY1305_SHA256ab90K180K270K360K450K404263.45397022.401. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: bertsquad-12 - Device: CPU - Executor: Standardab4812162015.5915.321. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

Llamafile

Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Text Generation 128

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Text Generation 128ab61218243026.2825.83

Rustls

Benchmark: handshake - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256

OpenBenchmarking.orghandshakes/s, More Is BetterRustls 0.23.17Benchmark: handshake - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256ab20K40K60K80K100K80462.679085.81. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs

Rustls

Benchmark: handshake-resume - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256

OpenBenchmarking.orghandshakes/s, More Is BetterRustls 0.23.17Benchmark: handshake-resume - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256ab800K1600K2400K3200K4000K3563852.573504511.311. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs

Stockfish

Chess Benchmark

OpenBenchmarking.orgNodes Per Second, More Is BetterStockfishChess Benchmarkab10M20M30M40M50M46507038457517471. Stockfish 16 by the Stockfish developers (see AUTHORS file)

POV-Ray

Trace Time

OpenBenchmarking.orgSeconds, Fewer Is BetterPOV-RayTrace Timeab51015202518.5418.851. POV-Ray 3.7.0.10.unofficial

Llamafile

Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Text Generation 128

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Text Generation 128ab369121510.4710.64

Llamafile

Model: Llama-3.2-3B-Instruct.Q6_K - Test: Text Generation 128

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: Llama-3.2-3B-Instruct.Q6_K - Test: Text Generation 128ab51015202520.1319.81

Renaissance

Test: ALS Movie Lens

OpenBenchmarking.orgms, Fewer Is BetterRenaissance 0.16Test: ALS Movie Lensabc2K4K6K8K10K9805.79958.39907.4MIN: 9253.4 / MAX: 10057.61MIN: 9305.94 / MAX: 10040.58MIN: 9393.64 / MAX: 10087.8

SVT-AV1

Encoder Mode: Preset 13 - Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.3Encoder Mode: Preset 13 - Input: Bosphorus 4Kabc50100150200250212.52209.77212.951. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

oneDNN

Harness: Recurrent Neural Network Inference - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.6Harness: Recurrent Neural Network Inference - Engine: CPUab150300450600750700.86711.43MIN: 679.89MIN: 684.031. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ZFNet-512 - Device: CPU - Executor: Standardab20406080100102.33103.861. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

x265

Video Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is Betterx265Video Input: Bosphorus 1080pabc306090120150114.45112.85114.521. x265 [info]: HEVC encoder version 3.5+1-f0c1022b6

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: fcn-resnet101-11 - Device: CPU - Executor: Standardab0.72381.44762.17142.89523.6193.21673.17051. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

simdjson

Throughput Test: PartialTweets

OpenBenchmarking.orgGB/s, More Is Bettersimdjson 3.10Throughput Test: PartialTweetsabc36912159.769.829.681. (CXX) g++ options: -O3 -lrt

Whisperfile

Model Size: Small

OpenBenchmarking.orgSeconds, Fewer Is BetterWhisperfile 20Aug24Model Size: Smallab4080120160200195.42192.68

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standardab112233445547.0747.731. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

NAMD

Input: ATPase with 327,506 Atoms

OpenBenchmarking.orgns/day, More Is BetterNAMD 3.0Input: ATPase with 327,506 Atomsabc0.63661.27321.90982.54643.1832.796322.790252.82925

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standardab0.34690.69381.04071.38761.73451.541961.521091. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

Apache CouchDB

Bulk Size: 100 - Inserts: 3000 - Rounds: 30

OpenBenchmarking.orgSeconds, Fewer Is BetterApache CouchDB 3.4.1Bulk Size: 100 - Inserts: 3000 - Rounds: 30ab50100150200250232.19235.351. (CXX) g++ options: -flto -lstdc++ -shared -lei

Numpy Benchmark

OpenBenchmarking.orgScore, More Is BetterNumpy Benchmarkab2004006008001000775.75765.35

PyPerformance

Benchmark: chaos

OpenBenchmarking.orgMilliseconds, Fewer Is BetterPyPerformance 1.11Benchmark: chaosab91827364538.238.7

ACES DGEMM

Sustained Floating-Point Rate

OpenBenchmarking.orgGFLOP/s, More Is BetterACES DGEMM 1.0Sustained Floating-Point Rateabc20040060080010001141.191137.391127.271. (CC) gcc options: -ffast-math -mavx2 -O3 -fopenmp -lopenblas

SVT-AV1

Encoder Mode: Preset 5 - Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.3Encoder Mode: Preset 5 - Input: Bosphorus 1080pabc20406080100101.97100.89102.111. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

PyPerformance

Benchmark: float

OpenBenchmarking.orgMilliseconds, Fewer Is BetterPyPerformance 1.11Benchmark: floatab112233445550.750.1

XNNPACK

Model: FP16MobileNetV3Small

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: FP16MobileNetV3Smallab20040060080010009209311. (CXX) g++ options: -O3 -lrt -lm

Rustls

Benchmark: handshake-ticket - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256

OpenBenchmarking.orghandshakes/s, More Is BetterRustls 0.23.17Benchmark: handshake-ticket - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256ab600K1200K1800K2400K3000K2620332.002589637.921. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs

XNNPACK

Model: QS8MobileNetV2

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: QS8MobileNetV2ab20040060080010008448541. (CXX) g++ options: -O3 -lrt -lm

Whisperfile

Model Size: Tiny

OpenBenchmarking.orgSeconds, Fewer Is BetterWhisperfile 20Aug24Model Size: Tinyab102030405041.7142.20

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ArcFace ResNet-100 - Device: CPU - Executor: Standardab102030405042.4541.961. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

Renaissance

Test: Apache Spark PageRank

OpenBenchmarking.orgms, Fewer Is BetterRenaissance 0.16Test: Apache Spark PageRankabc50010001500200025002412.22439.92439.2MIN: 1691.04MIN: 1684.02 / MAX: 2439.95MIN: 1679.36 / MAX: 2439.21

SVT-AV1

Encoder Mode: Preset 8 - Input: Beauty 4K 10-bit

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.3Encoder Mode: Preset 8 - Input: Beauty 4K 10-bitabc369121512.4712.6112.601. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

PyPerformance

Benchmark: raytrace

OpenBenchmarking.orgMilliseconds, Fewer Is BetterPyPerformance 1.11Benchmark: raytraceab4080120160200175177

Rustls

Benchmark: handshake-ticket - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384

OpenBenchmarking.orghandshakes/s, More Is BetterRustls 0.23.17Benchmark: handshake-ticket - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384ab300K600K900K1200K1500K1553632.141536355.901. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs

FinanceBench

Benchmark: Bonds OpenMP

OpenBenchmarking.orgms, Fewer Is BetterFinanceBench 2016-07-25Benchmark: Bonds OpenMPab7K14K21K28K35K33061.2233432.641. (CXX) g++ options: -O3 -march=native -fopenmp

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: CaffeNet 12-int8 - Device: CPU - Executor: Standardab140280420560700636.32629.661. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

Apache CouchDB

Bulk Size: 500 - Inserts: 3000 - Rounds: 30

OpenBenchmarking.orgSeconds, Fewer Is BetterApache CouchDB 3.4.1Bulk Size: 500 - Inserts: 3000 - Rounds: 30ab110220330440550511.78517.151. (CXX) g++ options: -flto -lstdc++ -shared -lei

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standardab80160240320400390.60386.581. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

PyPerformance

Benchmark: go

OpenBenchmarking.orgMilliseconds, Fewer Is BetterPyPerformance 1.11Benchmark: goab2040608010077.877.0

SVT-AV1

Encoder Mode: Preset 3 - Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.3Encoder Mode: Preset 3 - Input: Bosphorus 4Kabc36912159.5909.5549.4951. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

Apache CouchDB

Bulk Size: 300 - Inserts: 1000 - Rounds: 30

OpenBenchmarking.orgSeconds, Fewer Is BetterApache CouchDB 3.4.1Bulk Size: 300 - Inserts: 1000 - Rounds: 30ab20406080100106.13107.181. (CXX) g++ options: -flto -lstdc++ -shared -lei

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512ab70140210280350327.30324.211. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

Blender

Blend File: Junkshop - Compute: CPU-Only

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 4.3Blend File: Junkshop - Compute: CPU-Onlyab163248648073.5674.26

SVT-AV1

Encoder Mode: Preset 5 - Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.3Encoder Mode: Preset 5 - Input: Bosphorus 4Kabc81624324034.5434.2334.451. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

OSPRay

Benchmark: gravity_spheres_volume/dim_512/ao/real_time

OpenBenchmarking.orgItems Per Second, More Is BetterOSPRay 3.2Benchmark: gravity_spheres_volume/dim_512/ao/real_timeabc2468107.639447.574087.64282

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: yolov4 - Device: CPU - Executor: Standardab369121511.0610.961. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

7-Zip Compression

Test: Decompression Rating

OpenBenchmarking.orgMIPS, More Is Better7-Zip CompressionTest: Decompression Ratingabc40K80K120K160K200K1659161668431673211. 7-Zip 23.01 (x64) : Copyright (c) 1999-2023 Igor Pavlov : 2023-06-20

oneDNN

Harness: Recurrent Neural Network Training - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.6Harness: Recurrent Neural Network Training - Engine: CPUab300600900120015001372.031383.64MIN: 1342.06MIN: 1333.571. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

OpenVINO GenAI

Model: Gemma-7b-int4-ov - Device: CPU

OpenBenchmarking.orgtokens/s, More Is BetterOpenVINO GenAI 2024.5Model: Gemma-7b-int4-ov - Device: CPUab36912159.839.91

SVT-AV1

Encoder Mode: Preset 13 - Input: Beauty 4K 10-bit

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.3Encoder Mode: Preset 13 - Input: Beauty 4K 10-bitabc51015202518.5918.4418.561. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

OSPRay

Benchmark: gravity_spheres_volume/dim_512/scivis/real_time

OpenBenchmarking.orgItems Per Second, More Is BetterOSPRay 3.2Benchmark: gravity_spheres_volume/dim_512/scivis/real_timeabc2468107.587897.528757.55791

SVT-AV1

Encoder Mode: Preset 3 - Input: Beauty 4K 10-bit

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.3Encoder Mode: Preset 3 - Input: Beauty 4K 10-bitabc0.320.640.961.281.61.4221.4151.4111. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

GROMACS

Input: water_GMX50_bare

OpenBenchmarking.orgNs Per Day, More Is BetterGROMACSInput: water_GMX50_bareab0.38070.76141.14211.52281.90351.6921.6791. GROMACS version: 2023.3-Ubuntu_2023.3_1ubuntu3

simdjson

Throughput Test: DistinctUserID

OpenBenchmarking.orgGB/s, More Is Bettersimdjson 3.10Throughput Test: DistinctUserIDabc369121510.4610.3810.431. (CXX) g++ options: -O3 -lrt

Blender

Blend File: Classroom - Compute: CPU-Only

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 4.3Blend File: Classroom - Compute: CPU-Onlyab306090120150143.36144.41

PyPerformance

Benchmark: pathlib

OpenBenchmarking.orgMilliseconds, Fewer Is BetterPyPerformance 1.11Benchmark: pathlibab4812162014.214.3

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128ab2468107.247.191. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

ASTC Encoder

Preset: Exhaustive

OpenBenchmarking.orgMT/s, More Is BetterASTC Encoder 5.0Preset: Exhaustiveab0.3790.7581.1371.5161.8951.68441.67281. (CXX) g++ options: -O3 -flto -pthread

ASTC Encoder

Preset: Thorough

OpenBenchmarking.orgMT/s, More Is BetterASTC Encoder 5.0Preset: Thoroughab51015202520.3020.161. (CXX) g++ options: -O3 -flto -pthread

Etcpak

Benchmark: Multi-Threaded - Configuration: ETC2

OpenBenchmarking.orgMpx/s, More Is BetterEtcpak 2.0Benchmark: Multi-Threaded - Configuration: ETC2abc120240360480600577.82575.02573.911. (CXX) g++ options: -flto -pthread

PyPerformance

Benchmark: nbody

OpenBenchmarking.orgMilliseconds, Fewer Is BetterPyPerformance 1.11Benchmark: nbodyab132639526559.059.4

SVT-AV1

Encoder Mode: Preset 3 - Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.3Encoder Mode: Preset 3 - Input: Bosphorus 1080pabc71421283529.5729.3829.471. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024ab153045607569.2668.801. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

Apache CouchDB

Bulk Size: 500 - Inserts: 1000 - Rounds: 30

OpenBenchmarking.orgSeconds, Fewer Is BetterApache CouchDB 3.4.1Bulk Size: 500 - Inserts: 1000 - Rounds: 30ab306090120150148.05149.031. (CXX) g++ options: -flto -lstdc++ -shared -lei

ASTC Encoder

Preset: Medium

OpenBenchmarking.orgMT/s, More Is BetterASTC Encoder 5.0Preset: Mediumab306090120150156.22155.271. (CXX) g++ options: -O3 -flto -pthread

Blender

Blend File: Barbershop - Compute: CPU-Only

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 4.3Blend File: Barbershop - Compute: CPU-Onlyab110220330440550506.2509.3

PyPerformance

Benchmark: pickle_pure_python

OpenBenchmarking.orgMilliseconds, Fewer Is BetterPyPerformance 1.11Benchmark: pickle_pure_pythonab4080120160200165166

ASTC Encoder

Preset: Very Thorough

OpenBenchmarking.orgMT/s, More Is BetterASTC Encoder 5.0Preset: Very Thoroughab0.61671.23341.85012.46683.08352.74102.72481. (CXX) g++ options: -O3 -flto -pthread

PyPerformance

Benchmark: gc_collect

OpenBenchmarking.orgMilliseconds, Fewer Is BetterPyPerformance 1.11Benchmark: gc_collectab150300450600750677681

OSPRay

Benchmark: particle_volume/scivis/real_time

OpenBenchmarking.orgItems Per Second, More Is BetterOSPRay 3.2Benchmark: particle_volume/scivis/real_timeabc36912158.984868.932458.97005

Primesieve

Length: 1e13

OpenBenchmarking.orgSeconds, Fewer Is BetterPrimesieve 12.6Length: 1e13ab2040608010078.5078.951. (CXX) g++ options: -O3

PyPerformance

Benchmark: regex_compile

OpenBenchmarking.orgMilliseconds, Fewer Is BetterPyPerformance 1.11Benchmark: regex_compileab163248648069.870.2

Llamafile

Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Text Generation 16

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Text Generation 16ab0.40280.80561.20841.61122.0141.781.79

OSPRay

Benchmark: particle_volume/pathtracer/real_time

OpenBenchmarking.orgItems Per Second, More Is BetterOSPRay 3.2Benchmark: particle_volume/pathtracer/real_timeabc50100150200250236.25235.33234.97

PyPerformance

Benchmark: async_tree_io

OpenBenchmarking.orgMilliseconds, Fewer Is BetterPyPerformance 1.11Benchmark: async_tree_ioab160320480640800755759

Llamafile

Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Text Generation 128

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Text Generation 128ab0.450.91.351.82.251.992.00

Blender

Blend File: Fishy Cat - Compute: CPU-Only

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 4.3Blend File: Fishy Cat - Compute: CPU-Onlyab163248648071.3571.70

Primesieve

Length: 1e12

OpenBenchmarking.orgSeconds, Fewer Is BetterPrimesieve 12.6Length: 1e12ab2468106.3476.3781. (CXX) g++ options: -O3

Rustls

Benchmark: handshake - Suite: TLS13_CHACHA20_POLY1305_SHA256

OpenBenchmarking.orghandshakes/s, More Is BetterRustls 0.23.17Benchmark: handshake - Suite: TLS13_CHACHA20_POLY1305_SHA256ab16K32K48K64K80K76454.4576083.731. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs

FinanceBench

Benchmark: Repo OpenMP

OpenBenchmarking.orgms, Fewer Is BetterFinanceBench 2016-07-25Benchmark: Repo OpenMPab5K10K15K20K25K21418.4521522.071. (CXX) g++ options: -O3 -march=native -fopenmp

PyPerformance

Benchmark: django_template

OpenBenchmarking.orgMilliseconds, Fewer Is BetterPyPerformance 1.11Benchmark: django_templateab51015202520.720.8

OSPRay

Benchmark: particle_volume/ao/real_time

OpenBenchmarking.orgItems Per Second, More Is BetterOSPRay 3.2Benchmark: particle_volume/ao/real_timeabc36912159.009178.966328.98586

BYTE Unix Benchmark

Computational Test: Dhrystone 2

OpenBenchmarking.orgLPS, More Is BetterBYTE Unix Benchmark 5.1.3-gitComputational Test: Dhrystone 2abc400M800M1200M1600M2000M1866536062.71857795366.11862548305.41. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm

QuantLib

Size: XXS

OpenBenchmarking.orgtasks/s, More Is BetterQuantLib 1.35-devSize: XXSabc369121513.4313.4313.491. (CXX) g++ options: -O3 -march=native -fPIE -pie

Y-Cruncher

Pi Digits To Calculate: 1B

OpenBenchmarking.orgSeconds, Fewer Is BetterY-Cruncher 0.8.5Pi Digits To Calculate: 1Bab51015202518.4918.40

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128ab2468106.886.851. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenVINO GenAI

Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU

OpenBenchmarking.orgtokens/s, More Is BetterOpenVINO GenAI 2024.5Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPUab51015202519.2819.20

Llamafile

Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Text Generation 16

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Text Generation 16ab61218243024.5924.69

Whisperfile

Model Size: Medium

OpenBenchmarking.orgSeconds, Fewer Is BetterWhisperfile 20Aug24Model Size: Mediumab120240360480600534.92532.81

BYTE Unix Benchmark

Computational Test: Pipe

OpenBenchmarking.orgLPS, More Is BetterBYTE Unix Benchmark 5.1.3-gitComputational Test: Pipeabc10M20M30M40M50M48806257.148718087.148613927.91. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm

Llamafile

Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Text Generation 16

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Text Generation 16ab369121510.2210.26

Blender

Blend File: BMW27 - Compute: CPU-Only

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 4.3Blend File: BMW27 - Compute: CPU-Onlyab122436486053.5553.75

OpenSSL

Algorithm: AES-128-GCM

OpenBenchmarking.orgbyte/s, More Is BetterOpenSSLAlgorithm: AES-128-GCMab20000M40000M60000M80000M100000M1047845221701044043478401. OpenSSL 3.0.13 30 Jan 2024 (Library: OpenSSL 3.0.13 30 Jan 2024) - Additional Parameters: -engine qatengine -async_jobs 8

OpenSSL

Algorithm: AES-256-GCM

OpenBenchmarking.orgbyte/s, More Is BetterOpenSSLAlgorithm: AES-256-GCMab20000M40000M60000M80000M100000M97172751700968217370601. OpenSSL 3.0.13 30 Jan 2024 (Library: OpenSSL 3.0.13 30 Jan 2024) - Additional Parameters: -engine qatengine -async_jobs 8

PyPerformance

Benchmark: python_startup

OpenBenchmarking.orgMilliseconds, Fewer Is BetterPyPerformance 1.11Benchmark: python_startupab1.30282.60563.90845.21126.5145.775.79

OSPRay

Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time

OpenBenchmarking.orgItems Per Second, More Is BetterOSPRay 3.2Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_timeabc2468108.820938.790968.81199

Whisper.cpp

Model: ggml-medium.en - Input: 2016 State of the Union

OpenBenchmarking.orgSeconds, Fewer Is BetterWhisper.cpp 1.6.2Model: ggml-medium.en - Input: 2016 State of the Unionab150300450600750700.91703.221. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vnni

PyPerformance

Benchmark: asyncio_websockets

OpenBenchmarking.orgMilliseconds, Fewer Is BetterPyPerformance 1.11Benchmark: asyncio_websocketsab70140210280350315316

OpenVINO GenAI

Model: Falcon-7b-instruct-int4-ov - Device: CPU

OpenBenchmarking.orgtokens/s, More Is BetterOpenVINO GenAI 2024.5Model: Falcon-7b-instruct-int4-ov - Device: CPUab369121512.9312.97

QuantLib

Size: S

OpenBenchmarking.orgtasks/s, More Is BetterQuantLib 1.35-devSize: Sabc369121512.7512.7112.721. (CXX) g++ options: -O3 -march=native -fPIE -pie

PyPerformance

Benchmark: xml_etree

OpenBenchmarking.orgMilliseconds, Fewer Is BetterPyPerformance 1.11Benchmark: xml_etreeab81624324035.835.7

7-Zip Compression

Test: Compression Rating

OpenBenchmarking.orgMIPS, More Is Better7-Zip CompressionTest: Compression Ratingabc40K80K120K160K200K1638591640501643131. 7-Zip 23.01 (x64) : Copyright (c) 1999-2023 Igor Pavlov : 2023-06-20

Apache CouchDB

Bulk Size: 100 - Inserts: 1000 - Rounds: 30

OpenBenchmarking.orgSeconds, Fewer Is BetterApache CouchDB 3.4.1Bulk Size: 100 - Inserts: 1000 - Rounds: 30ab163248648069.9370.111. (CXX) g++ options: -flto -lstdc++ -shared -lei

Build2

Time To Compile

OpenBenchmarking.orgSeconds, Fewer Is BetterBuild2 0.17Time To Compileab2040608010092.0592.29

BYTE Unix Benchmark

Computational Test: System Call

OpenBenchmarking.orgLPS, More Is BetterBYTE Unix Benchmark 5.1.3-gitComputational Test: System Callabc11M22M33M44M55M49140426.649062324.149016743.61. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm

Y-Cruncher

Pi Digits To Calculate: 500M

OpenBenchmarking.orgSeconds, Fewer Is BetterY-Cruncher 0.8.5Pi Digits To Calculate: 500Mab2468108.7728.794

Whisper.cpp

Model: ggml-base.en - Input: 2016 State of the Union

OpenBenchmarking.orgSeconds, Fewer Is BetterWhisper.cpp 1.6.2Model: ggml-base.en - Input: 2016 State of the Unionab2040608010087.4987.271. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vnni

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: T5 Encoder - Device: CPU - Executor: Standardab306090120150156.45156.831. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

PyPerformance

Benchmark: crypto_pyaes

OpenBenchmarking.orgMilliseconds, Fewer Is BetterPyPerformance 1.11Benchmark: crypto_pyaesab102030405041.741.8

NAMD

Input: STMV with 1,066,628 Atoms

OpenBenchmarking.orgns/day, More Is BetterNAMD 3.0Input: STMV with 1,066,628 Atomsabc0.17060.34120.51180.68240.8530.756560.756340.75813

Apache CouchDB

Bulk Size: 300 - Inserts: 3000 - Rounds: 30

OpenBenchmarking.orgSeconds, Fewer Is BetterApache CouchDB 3.4.1Bulk Size: 300 - Inserts: 3000 - Rounds: 30ab80160240320400367.83368.661. (CXX) g++ options: -flto -lstdc++ -shared -lei

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: GPT-2 - Device: CPU - Executor: Standardab306090120150134.60134.311. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

OpenSSL

Algorithm: ChaCha20-Poly1305

OpenBenchmarking.orgbyte/s, More Is BetterOpenSSLAlgorithm: ChaCha20-Poly1305ab20000M40000M60000M80000M100000M92393529340922163505801. OpenSSL 3.0.13 30 Jan 2024 (Library: OpenSSL 3.0.13 30 Jan 2024) - Additional Parameters: -engine qatengine -async_jobs 8

OpenSSL

Algorithm: ChaCha20

OpenBenchmarking.orgbyte/s, More Is BetterOpenSSLAlgorithm: ChaCha20ab30000M60000M90000M120000M150000M1305884950501303598841901. OpenSSL 3.0.13 30 Jan 2024 (Library: OpenSSL 3.0.13 30 Jan 2024) - Additional Parameters: -engine qatengine -async_jobs 8

BYTE Unix Benchmark

Computational Test: Whetstone Double

OpenBenchmarking.orgMWIPS, More Is BetterBYTE Unix Benchmark 5.1.3-gitComputational Test: Whetstone Doubleabc70K140K210K280K350K343491.9343113.0343187.01. (CC) gcc options: -pedantic -O3 -ffast-math -march=native -mtune=native -lm

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: super-resolution-10 - Device: CPU - Executor: Standardab306090120150141.12141.001. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

Blender

Blend File: Pabellon Barcelona - Compute: CPU-Only

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 4.3Blend File: Pabellon Barcelona - Compute: CPU-Onlyab4080120160200166.12166.25

ASTC Encoder

Preset: Fast

OpenBenchmarking.orgMT/s, More Is BetterASTC Encoder 5.0Preset: Fastab90180270360450396.65396.431. (CXX) g++ options: -O3 -flto -pthread

Rustls

Benchmark: handshake-resume - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384

OpenBenchmarking.orghandshakes/s, More Is BetterRustls 0.23.17Benchmark: handshake-resume - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384ab400K800K1200K1600K2000K1820810.211821261.881. (CC) gcc options: -m64 -lgcc_s -lutil -lrt -lpthread -lm -ldl -lc -pie -nodefaultlibs

Apache Cassandra

Test: Writes

OpenBenchmarking.orgOp/s, More Is BetterApache Cassandra 5.0Test: Writesab60K120K180K240K300K271333271373

Llamafile

Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Prompt Processing 2048

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Prompt Processing 2048ab3K6K9K12K15K1228812288

Llamafile

Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Prompt Processing 1024

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Prompt Processing 1024ab1300260039005200650061446144

Llamafile

Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Prompt Processing 512

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Prompt Processing 512ab700140021002800350030723072

Llamafile

Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Prompt Processing 256

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Prompt Processing 256ab3006009001200150015361536

Llamafile

Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Prompt Processing 2048

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Prompt Processing 2048ab7K14K21K28K35K3276832768

Llamafile

Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Prompt Processing 1024

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Prompt Processing 1024ab4K8K12K16K20K1638416384

Llamafile

Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Prompt Processing 512

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Prompt Processing 512ab2K4K6K8K10K81928192

Llamafile

Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Prompt Processing 256

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Prompt Processing 256ab900180027003600450040964096

Llamafile

Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 2048

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 2048ab7K14K21K28K35K3276832768

Llamafile

Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 1024

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 1024ab4K8K12K16K20K1638416384

Llamafile

Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 512

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 512ab2K4K6K8K10K81928192

Llamafile

Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 256

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 256ab900180027003600450040964096

Llamafile

Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 2048

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 2048ab7K14K21K28K35K3276832768

Llamafile

Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 1024

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 1024ab4K8K12K16K20K1638416384

Llamafile

Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 512

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 512ab2K4K6K8K10K81928192

Llamafile

Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 256

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 256ab900180027003600450040964096

PyPerformance

Benchmark: json_loads

OpenBenchmarking.orgMilliseconds, Fewer Is BetterPyPerformance 1.11Benchmark: json_loadsab369121512.112.1

OpenVINO GenAI

Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time Per Output Token

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO GenAI 2024.5Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time Per Output Tokenab122436486051.8652.10

OpenVINO GenAI

Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time To First Token

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO GenAI 2024.5Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time To First Tokenab132639526555.9356.26

OpenVINO GenAI

Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time Per Output Token

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO GenAI 2024.5Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time Per Output Tokenab2040608010077.3477.13

OpenVINO GenAI

Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time To First Token

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO GenAI 2024.5Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time To First Tokenab2040608010086.0684.39

OpenVINO GenAI

Model: Gemma-7b-int4-ov - Device: CPU - Time Per Output Token

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO GenAI 2024.5Model: Gemma-7b-int4-ov - Device: CPU - Time Per Output Tokenab20406080100101.72100.94

OpenVINO GenAI

Model: Gemma-7b-int4-ov - Device: CPU - Time To First Token

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO GenAI 2024.5Model: Gemma-7b-int4-ov - Device: CPU - Time To First Tokenab20406080100106.62107.03

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standardab51015202521.2420.951. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standardab140280420560700648.52657.421. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: super-resolution-10 - Device: CPU - Executor: Standardab2468107.086017.091721. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standardab0.58181.16361.74542.32722.9092.558982.585891. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ArcFace ResNet-100 - Device: CPU - Executor: Standardab61218243023.5523.831. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: fcn-resnet101-11 - Device: CPU - Executor: Standardab70140210280350310.88315.411. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: CaffeNet 12-int8 - Device: CPU - Executor: Standardab0.35720.71441.07161.42881.7861.570841.587461. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: bertsquad-12 - Device: CPU - Executor: Standardab153045607564.1465.281. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: T5 Encoder - Device: CPU - Executor: Standardab2468106.391126.375661. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ZFNet-512 - Device: CPU - Executor: Standardab36912159.769859.625901. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: yolov4 - Device: CPU - Executor: Standardab2040608010090.4591.241. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: GPT-2 - Device: CPU - Executor: Standardab2468107.427767.443401. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt


Phoronix Test Suite v10.8.5