genoa tests eoy2024

Benchmarks for a future article. 2 x AMD EPYC 9124 16-Core testing with an AMD Titanite_4G (RTI1007B BIOS) motherboard and ASPEED graphics on Ubuntu 23.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2412273-NE-GENOATEST41&rdt&grs.

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024

LiteRT

Model: NASNet Mobile

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024

SVT-AV1

Encoder Mode: Preset 13 - Input: Bosphorus 4K

Renaissance

Test: Random Forest

Renaissance

Test: Savina Reactors.IO

WebP Image Encode

Encode Settings: Default

Renaissance

Test: Scala Dotty

XNNPACK

Model: FP16MobileNetV3Large

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Standard

Cpuminer-Opt

Algorithm: Garlicoin

XNNPACK

Model: FP16MobileNetV3Small

SVT-AV1

Encoder Mode: Preset 13 - Input: Bosphorus 1080p

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard

Renaissance

Test: Akka Unbalanced Cobwebbed Tree

Stockfish

Chess Benchmark

Renaissance

Test: In-Memory Database Shootout

Renaissance

Test: Apache Spark Bayes

7-Zip Compression

Test: Decompression Rating

x265

Video Input: Bosphorus 4K

Rustls

Benchmark: handshake - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384

OpenVINO GenAI

Model: Falcon-7b-instruct-int4-ov - Device: CPU

Timed Eigen Compilation

Time To Compile

XNNPACK

Model: FP32MobileNetV3Large

srsRAN Project

Test: PDSCH Processor Benchmark, Throughput Total

VVenC

Video Input: Bosphorus 4K - Video Preset: Faster

XNNPACK

Model: FP16MobileNetV2

XNNPACK

Model: FP32MobileNetV2

Whisper.cpp

Model: ggml-base.en - Input: 2016 State of the Union

VVenC

Video Input: Bosphorus 1080p - Video Preset: Fast

XNNPACK

Model: FP16MobileNetV1

x265

Video Input: Bosphorus 1080p

Whisperfile

Model Size: Small

Whisperfile

Model Size: Tiny

LiteRT

Model: Mobilenet Quant

Cpuminer-Opt

Algorithm: x20r

XNNPACK

Model: FP32MobileNetV1

VVenC

Video Input: Bosphorus 1080p - Video Preset: Faster

Renaissance

Test: ALS Movie Lens

simdjson

Throughput Test: Kostya

LiteRT

Model: Mobilenet Float

oneDNN

Harness: Recurrent Neural Network Training - Engine: CPU

XNNPACK

Model: QS8MobileNetV2

oneDNN

Harness: Deconvolution Batch shapes_1d - Engine: CPU

7-Zip Compression

Test: Compression Rating

LiteRT

Model: DeepLab V3

uvg266

Video Input: Bosphorus 1080p - Video Preset: Very Fast

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128

LiteRT

Model: Inception V4

uvg266

Video Input: Bosphorus 1080p - Video Preset: Ultra Fast

simdjson

Throughput Test: DistinctUserID

Y-Cruncher

Pi Digits To Calculate: 5B

WarpX

Input: Plasma Acceleration

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Standard

LiteRT

Model: SqueezeNet

x265

Video Input: Bosphorus 4K

uvg266

Video Input: Bosphorus 1080p - Video Preset: Super Fast

Renaissance

Test: Genetic Algorithm Using Jenetics + Futures

Build2

Time To Compile

XNNPACK

Model: FP32MobileNetV3Small

Rustls

Benchmark: handshake-resume - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384

simdjson

Throughput Test: PartialTweets

WebP Image Encode

Encode Settings: Quality 100

Blender

Blend File: BMW27 - Compute: CPU-Only

LiteRT

Model: Quantized COCO SSD MobileNet v1

OSPRay

Benchmark: particle_volume/ao/real_time

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

Stress-NG

Test: Context Switching

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard

Palabos

Grid Size: 500

OSPRay

Benchmark: gravity_spheres_volume/dim_512/scivis/real_time

simdjson

Throughput Test: TopTweet

Stress-NG

Test: CPU Stress

oneDNN

Harness: IP Shapes 3D - Engine: CPU

Palabos

Grid Size: 400

simdjson

Throughput Test: LargeRandom

Timed PHP Compilation

Time To Compile

NAMD

Input: ATPase with 327,506 Atoms

OpenVINO GenAI

Model: Gemma-7b-int4-ov - Device: CPU

uvg266

Video Input: Bosphorus 4K - Video Preset: Super Fast

VVenC

Video Input: Bosphorus 4K - Video Preset: Fast

Renaissance

Test: Apache Spark PageRank

SVT-AV1

Encoder Mode: Preset 5 - Input: Bosphorus 4K

Y-Cruncher

Pi Digits To Calculate: 1B

WebP Image Encode

Encode Settings: Quality 100, Lossless

Whisperfile

Model Size: Medium

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard

GROMACS

Input: water_GMX50_bare

Rustls

Benchmark: handshake-ticket - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256

Renaissance

Test: Gaussian Mixture Model

Z3 Theorem Prover

SMT File: 2.smt2

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Standard

Cpuminer-Opt

Algorithm: Magi

OSPRay

Benchmark: gravity_spheres_volume/dim_512/ao/real_time

uvg266

Video Input: Bosphorus 4K - Video Preset: Ultra Fast

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

LZ4 Compression

Compression Level: 12 - Decompression Speed

Z3 Theorem Prover

SMT File: 1.smt2

ACES DGEMM

Sustained Floating-Point Rate

CP2K Molecular Dynamics

Input: H20-256

oneDNN

Harness: Convolution Batch Shapes Auto - Engine: CPU

Whisper.cpp

Model: ggml-medium.en - Input: 2016 State of the Union

Whisper.cpp

Model: ggml-small.en - Input: 2016 State of the Union

OSPRay

Benchmark: particle_volume/pathtracer/real_time

LZ4 Compression

Compression Level: 12 - Compression Speed

BYTE Unix Benchmark

Computational Test: Dhrystone 2

uvg266

Video Input: Bosphorus 1080p - Video Preset: Slow

SVT-AV1

Encoder Mode: Preset 5 - Input: Bosphorus 1080p

Y-Cruncher

Pi Digits To Calculate: 500M

Blender

Blend File: Pabellon Barcelona - Compute: CPU-Only

oneDNN

Harness: Deconvolution Batch shapes_3d - Engine: CPU

SVT-AV1

Encoder Mode: Preset 3 - Input: Bosphorus 4K

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU

SVT-AV1

Encoder Mode: Preset 8 - Input: Bosphorus 1080p

Cpuminer-Opt

Algorithm: Ringcoin

LZ4 Compression

Compression Level: 9 - Compression Speed

OpenVINO

Model: Person Re-Identification Retail FP16 - Device: CPU

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128

Rustls

Benchmark: handshake-resume - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256

OpenVINO

Model: Person Vehicle Bike Detection FP16 - Device: CPU

OpenVINO GenAI

Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU

OpenVINO

Model: Road Segmentation ADAS FP16 - Device: CPU

Blender

Blend File: Fishy Cat - Compute: CPU-Only

uvg266

Video Input: Bosphorus 4K - Video Preset: Very Fast

LZ4 Compression

Compression Level: 2 - Compression Speed

OpenSSL

Algorithm: SHA256

uvg266

Video Input: Bosphorus 1080p - Video Preset: Medium

Timed Node.js Compilation

Time To Compile

OpenVINO

Model: Person Vehicle Bike Detection FP16 - Device: CPU

OpenVINO

Model: Vehicle Detection FP16 - Device: CPU

OpenVINO

Model: Handwritten English Recognition FP16 - Device: CPU

OpenVINO GenAI

Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU

OpenVINO

Model: Handwritten English Recognition FP16 - Device: CPU

OpenVINO

Model: Face Detection Retail FP16-INT8 - Device: CPU

LZ4 Compression

Compression Level: 9 - Decompression Speed

OSPRay

Benchmark: particle_volume/scivis/real_time

Blender

Blend File: Junkshop - Compute: CPU-Only

OpenSSL

Algorithm: AES-128-GCM

oneDNN

Harness: IP Shapes 1D - Engine: CPU

OpenVINO

Model: Vehicle Detection FP16 - Device: CPU

SVT-AV1

Encoder Mode: Preset 3 - Input: Bosphorus 1080p

OpenVINO

Model: Road Segmentation ADAS FP16-INT8 - Device: CPU

uvg266

Video Input: Bosphorus 4K - Video Preset: Medium

Palabos

Grid Size: 100

WarpX

Input: Uniform Plasma

OSPRay

Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time

x265

Video Input: Bosphorus 1080p

OpenVINO

Model: Road Segmentation ADAS FP16 - Device: CPU

LZ4 Compression

Compression Level: 3 - Decompression Speed

OpenVINO

Model: Handwritten English Recognition FP16-INT8 - Device: CPU

NAMD

Input: STMV with 1,066,628 Atoms

OpenVINO

Model: Person Detection FP16 - Device: CPU

OpenVINO

Model: Person Detection FP16 - Device: CPU

oneDNN

Harness: Recurrent Neural Network Inference - Engine: CPU

Laghos

Test: Triple Point Problem

OpenVINO

Model: Handwritten English Recognition FP16-INT8 - Device: CPU

OpenVINO

Model: Face Detection Retail FP16-INT8 - Device: CPU

Cpuminer-Opt

Algorithm: Triple SHA-256, Onecoin

OpenSSL

Algorithm: ChaCha20

OpenVINO

Model: Weld Porosity Detection FP16-INT8 - Device: CPU

OpenVINO

Model: Machine Translation EN To DE FP16 - Device: CPU

C-Ray

Resolution: 4K - Rays Per Pixel: 16

Blender

Blend File: Barbershop - Compute: CPU-Only

Rustls

Benchmark: handshake-ticket - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384

OpenVINO

Model: Machine Translation EN To DE FP16 - Device: CPU

ASTC Encoder

Preset: Thorough

Stress-NG

Test: Radix String Sort

OpenVINO

Model: Face Detection Retail FP16 - Device: CPU

OpenVINO

Model: Weld Porosity Detection FP16 - Device: CPU

QuantLib

Size: XXS

OpenVINO

Model: Noise Suppression Poconet-Like FP16 - Device: CPU

Stress-NG

Test: Socket Activity

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU

LZ4 Compression

Compression Level: 1 - Compression Speed

OpenVINO

Model: Weld Porosity Detection FP16-INT8 - Device: CPU

Blender

Blend File: Classroom - Compute: CPU-Only

Cpuminer-Opt

Algorithm: Deepcoin

OpenVINO

Model: Road Segmentation ADAS FP16-INT8 - Device: CPU

BYTE Unix Benchmark

Computational Test: Whetstone Double

uvg266

Video Input: Bosphorus 4K - Video Preset: Slow

OpenVINO

Model: Vehicle Detection FP16-INT8 - Device: CPU

Rustls

Benchmark: handshake - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256

OpenSSL

Algorithm: AES-256-GCM

LZ4 Compression

Compression Level: 2 - Decompression Speed

ASTC Encoder

Preset: Fast

C-Ray

Resolution: 1080p - Rays Per Pixel: 16

ASTC Encoder

Preset: Very Thorough

OpenVINO

Model: Face Detection FP16-INT8 - Device: CPU

OpenVINO

Model: Face Detection FP16 - Device: CPU

ASTC Encoder

Preset: Exhaustive

OpenVINO

Model: Person Re-Identification Retail FP16 - Device: CPU

OpenVINO

Model: Noise Suppression Poconet-Like FP16 - Device: CPU

srsRAN Project

Test: PUSCH Processor Benchmark, Throughput Total

OpenVINO

Model: Person Detection FP32 - Device: CPU

LZ4 Compression

Compression Level: 1 - Decompression Speed

SVT-AV1

Encoder Mode: Preset 8 - Input: Bosphorus 4K

Primesieve

Length: 1e12

Primesieve

Length: 1e13

OpenSSL

Algorithm: RSA4096

Cpuminer-Opt

Algorithm: scrypt

OpenSSL

Algorithm: AES-128-GCM

QuantLib

Size: S

Cpuminer-Opt

Algorithm: Quad SHA-256, Pyrite

BYTE Unix Benchmark

Computational Test: Pipe

LZ4 Compression

Compression Level: 3 - Compression Speed

LiteRT

Model: Inception ResNet V2

OpenVINO

Model: Person Detection FP32 - Device: CPU

OpenVINO

Model: Face Detection FP16 - Device: CPU

OpenSSL

Algorithm: SHA512

BYTE Unix Benchmark

Computational Test: System Call

Stress-NG

Test: Bitonic Integer Sort

Cpuminer-Opt

Algorithm: Blake-2 S

OpenSSL

Algorithm: ChaCha20-Poly1305

OpenSSL

Algorithm: SHA256

OpenSSL

Algorithm: ChaCha20-Poly1305

Renaissance

Test: Finagle HTTP Requests

OpenVINO

Model: Face Detection FP16-INT8 - Device: CPU

ASTC Encoder

Preset: Medium

OpenSSL

Algorithm: AES-256-GCM

OpenSSL

Algorithm: ChaCha20

C-Ray

Resolution: 5K - Rays Per Pixel: 16

Laghos

Test: Sedov Blast Wave, ube_922_hex.mesh

OpenSSL

Algorithm: SHA512

OpenSSL

Algorithm: RSA4096

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU

OpenVINO

Model: Weld Porosity Detection FP16 - Device: CPU

OpenVINO

Model: Vehicle Detection FP16-INT8 - Device: CPU

OpenVINO

Model: Face Detection Retail FP16 - Device: CPU

Cpuminer-Opt

Algorithm: LBC, LBRY Credits

Cpuminer-Opt

Algorithm: Myriad-Groestl

Cpuminer-Opt

Algorithm: Skeincoin

Intel Open Image Denoise

Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only

Intel Open Image Denoise

Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only

Intel Open Image Denoise

Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only

WebP Image Encode

Encode Settings: Quality 100, Lossless, Highest Compression

WebP Image Encode

Encode Settings: Quality 100, Highest Compression

OpenVINO GenAI

Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time Per Output Token

OpenVINO GenAI

Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time To First Token

OpenVINO GenAI

Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time Per Output Token

OpenVINO GenAI

Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time To First Token

OpenVINO GenAI

Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU - Time Per Output Token

OpenVINO GenAI

Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU - Time To First Token

OpenVINO GenAI

Model: Gemma-7b-int4-ov - Device: CPU - Time Per Output Token

OpenVINO GenAI

Model: Gemma-7b-int4-ov - Device: CPU - Time To First Token

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Standard

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Standard

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Standard

Phoronix Test Suite v10.8.5