phoronix-machine-learning.txt

AMD Ryzen Threadripper 7960X 24-Cores tested with a Gigabyte TRX50 AERO D (FA BIOS) and a Sapphire AMD Radeon RX 7900 XTX 24GB on Ubuntu 24.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2411137-NE-PHORONIXM28&grr.

TensorFlow

Device: GPU - Batch Size: 512 - Model: VGG-16

TensorFlow

Device: GPU - Batch Size: 256 - Model: VGG-16

TensorFlow

Device: GPU - Batch Size: 512 - Model: ResNet-50

Scikit-Learn

Benchmark: Isotonic / Pathological

TensorFlow

Device: GPU - Batch Size: 256 - Model: ResNet-50

TensorFlow

Device: GPU - Batch Size: 64 - Model: VGG-16

Scikit-Learn

Benchmark: Isotonic / Perturbed Logarithm

TensorFlow

Device: GPU - Batch Size: 512 - Model: GoogLeNet

Scikit-Learn

Benchmark: Isotonic / Logistic

TensorFlow

Device: CPU - Batch Size: 512 - Model: VGG-16

TensorFlow

Device: CPU - Batch Size: 256 - Model: ResNet-50

TensorFlow

Device: GPU - Batch Size: 32 - Model: VGG-16

TensorFlow

Device: GPU - Batch Size: 512 - Model: AlexNet

LeelaChessZero

Backend: BLAS

Scikit-Learn

Benchmark: SAGA

TensorFlow

Device: GPU - Batch Size: 256 - Model: GoogLeNet

TensorFlow

Device: CPU - Batch Size: 512 - Model: ResNet-50

TensorFlow

Device: CPU - Batch Size: 256 - Model: VGG-16

PyTorch

Device: CPU - Batch Size: 512 - Model: Efficientnet_v2_l

TensorFlow

Device: GPU - Batch Size: 64 - Model: ResNet-50

TensorFlow

Device: CPU - Batch Size: 512 - Model: GoogLeNet

TensorFlow

Device: GPU - Batch Size: 16 - Model: VGG-16

Scikit-Learn

Benchmark: Sparse Random Projections / 100 Iterations

Scikit-Learn

Benchmark: Hist Gradient Boosting Adult

Whisper.cpp

Model: ggml-medium.en - Input: 2016 State of the Union

TensorFlow

Device: GPU - Batch Size: 256 - Model: AlexNet

Scikit-Learn

Benchmark: Plot Parallel Pairwise

Scikit-Learn

Benchmark: Hist Gradient Boosting Higgs Boson

NCNN

Target: CPU - Model: FastestDet

NCNN

Target: CPU - Model: vision_transformer

NCNN

Target: CPU - Model: regnety_400m

NCNN

Target: CPU - Model: squeezenet_ssd

NCNN

Target: CPU - Model: yolov4-tiny

NCNN

Target: CPU - Model: mobilenetv2-yolov3

NCNN

Target: CPU - Model: resnet50

NCNN

Target: CPU - Model: alexnet

NCNN

Target: CPU - Model: resnet18

NCNN

Target: CPU - Model: vgg16

NCNN

Target: CPU - Model: googlenet

NCNN

Target: CPU - Model: blazeface

NCNN

Target: CPU - Model: efficientnet-b0

NCNN

Target: CPU - Model: mnasnet

NCNN

Target: CPU - Model: shufflenet-v2

NCNN

Target: CPU - Model: mobilenet-v3

NCNN

Target: CPU - Model: mobilenet-v2

NCNN

Target: CPU - Model: mobilenet

Scikit-Learn

Benchmark: Covertype Dataset Benchmark

Scikit-Learn

Benchmark: Lasso

TensorFlow

Device: GPU - Batch Size: 32 - Model: ResNet-50

Scikit-Learn

Benchmark: SGDOneClassSVM

Scikit-Learn

Benchmark: TSNE MNIST Dataset

OpenVINO

Model: Noise Suppression Poconet-Like FP16 - Device: CPU

OpenVINO

Model: Noise Suppression Poconet-Like FP16 - Device: CPU

OpenVINO

Model: Person Detection FP16 - Device: CPU

OpenVINO

Model: Person Detection FP16 - Device: CPU

OpenVINO

Model: Person Detection FP32 - Device: CPU

OpenVINO

Model: Person Detection FP32 - Device: CPU

TensorFlow Lite

Model: Inception V4

TensorFlow Lite

Model: NASNet Mobile

TensorFlow Lite

Model: SqueezeNet

OpenVINO

Model: Road Segmentation ADAS FP16 - Device: CPU

OpenVINO

Model: Road Segmentation ADAS FP16 - Device: CPU

OpenVINO

Model: Vehicle Detection FP16 - Device: CPU

OpenVINO

Model: Vehicle Detection FP16 - Device: CPU

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel

Scikit-Learn

Benchmark: Isolation Forest

TensorFlow

Device: GPU - Batch Size: 64 - Model: GoogLeNet

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel

TensorFlow

Device: CPU - Batch Size: 64 - Model: VGG-16

OpenVINO

Model: Machine Translation EN To DE FP16 - Device: CPU

OpenVINO

Model: Machine Translation EN To DE FP16 - Device: CPU

Scikit-Learn

Benchmark: GLM

Scikit-Learn

Benchmark: Hist Gradient Boosting

Whisper.cpp

Model: ggml-small.en - Input: 2016 State of the Union

TensorFlow

Device: GPU - Batch Size: 16 - Model: ResNet-50

PyTorch

Device: CPU - Batch Size: 32 - Model: Efficientnet_v2_l

PyTorch

Device: CPU - Batch Size: 16 - Model: Efficientnet_v2_l

PyTorch

Device: CPU - Batch Size: 64 - Model: Efficientnet_v2_l

Mobile Neural Network

Model: inception-v3

Mobile Neural Network

Model: mobilenet-v1-1.0

Mobile Neural Network

Model: MobileNetV2_224

Mobile Neural Network

Model: SqueezeNetV1.0

Mobile Neural Network

Model: resnet-v2-50

Mobile Neural Network

Model: squeezenetv1.1

Mobile Neural Network

Model: mobilenetV3

Mobile Neural Network

Model: nasnet

PyTorch

Device: CPU - Batch Size: 256 - Model: Efficientnet_v2_l

Scikit-Learn

Benchmark: Plot Hierarchical

XNNPACK

Model: QS8MobileNetV2

XNNPACK

Model: FP16MobileNetV3Small

XNNPACK

Model: FP16MobileNetV3Large

XNNPACK

Model: FP16MobileNetV2

XNNPACK

Model: FP16MobileNetV1

XNNPACK

Model: FP32MobileNetV3Small

XNNPACK

Model: FP32MobileNetV3Large

XNNPACK

Model: FP32MobileNetV2

XNNPACK

Model: FP32MobileNetV1

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: S3D

OpenCV

Test: DNN - Deep Neural Network

Scikit-Learn

Benchmark: Hist Gradient Boosting Categorical Only

Scikit-Learn

Benchmark: Plot Neighbors

TensorFlow

Device: GPU - Batch Size: 64 - Model: AlexNet

Scikit-Learn

Benchmark: Sparsify

Scikit-Learn

Benchmark: Plot Polynomial Kernel Approximation

Scikit-Learn

Benchmark: Feature Expansions

TensorFlow

Device: GPU - Batch Size: 32 - Model: GoogLeNet

TensorFlow

Device: CPU - Batch Size: 256 - Model: GoogLeNet

Scikit-Learn

Benchmark: Plot Ward

TensorFlow

Device: CPU - Batch Size: 32 - Model: VGG-16

Scikit-Learn

Benchmark: Sample Without Replacement

PyTorch

Device: CPU - Batch Size: 64 - Model: ResNet-152

PyTorch

Device: CPU - Batch Size: 256 - Model: ResNet-152

PyTorch

Device: CPU - Batch Size: 512 - Model: ResNet-152

PyTorch

Device: CPU - Batch Size: 32 - Model: ResNet-152

PyTorch

Device: CPU - Batch Size: 16 - Model: ResNet-152

Numpy Benchmark

TensorFlow

Device: CPU - Batch Size: 64 - Model: ResNet-50

Whisper.cpp

Model: ggml-base.en - Input: 2016 State of the Union

Scikit-Learn

Benchmark: Tree

TensorFlow

Device: CPU - Batch Size: 512 - Model: AlexNet

NCNN

Target: Vulkan GPU - Model: FastestDet

NCNN

Target: Vulkan GPU - Model: vision_transformer

NCNN

Target: Vulkan GPU - Model: regnety_400m

NCNN

Target: Vulkan GPU - Model: squeezenet_ssd

NCNN

Target: Vulkan GPU - Model: yolov4-tiny

NCNN

Target: Vulkan GPU - Model: mobilenetv2-yolov3

NCNN

Target: Vulkan GPU - Model: resnet50

NCNN

Target: Vulkan GPU - Model: alexnet

NCNN

Target: Vulkan GPU - Model: resnet18

NCNN

Target: Vulkan GPU - Model: vgg16

NCNN

Target: Vulkan GPU - Model: googlenet

NCNN

Target: Vulkan GPU - Model: blazeface

NCNN

Target: Vulkan GPU - Model: efficientnet-b0

NCNN

Target: Vulkan GPU - Model: mnasnet

NCNN

Target: Vulkan GPU - Model: shufflenet-v2

NCNN

Target: Vulkan GPU - Model: mobilenet-v3

NCNN

Target: Vulkan GPU - Model: mobilenet-v2

NCNN

Target: Vulkan GPU - Model: mobilenet

Scikit-Learn

Benchmark: Hist Gradient Boosting Threading

Scikit-Learn

Benchmark: SGD Regression

Scikit-Learn

Benchmark: Kernel PCA Solvers / Time vs. N Samples

PyTorch

Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Parallel

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Parallel

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Standard

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Standard

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

TensorFlow

Device: GPU - Batch Size: 32 - Model: AlexNet

oneDNN

Harness: Recurrent Neural Network Training - Engine: CPU

oneDNN

Harness: IP Shapes 1D - Engine: CPU

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Max SP Flops

oneDNN

Harness: Recurrent Neural Network Inference - Engine: CPU

Scikit-Learn

Benchmark: MNIST Dataset

TensorFlow

Device: GPU - Batch Size: 16 - Model: GoogLeNet

TensorFlow

Device: CPU - Batch Size: 16 - Model: VGG-16

Scikit-Learn

Benchmark: Plot Incremental PCA

Scikit-Learn

Benchmark: Text Vectorizers

OpenVINO

Model: Face Detection FP16 - Device: CPU

OpenVINO

Model: Face Detection FP16 - Device: CPU

OpenVINO

Model: Face Detection FP16-INT8 - Device: CPU

OpenVINO

Model: Face Detection FP16-INT8 - Device: CPU

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Parallel

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Parallel

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Parallel

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Parallel

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

OpenVINO

Model: Road Segmentation ADAS FP16-INT8 - Device: CPU

OpenVINO

Model: Road Segmentation ADAS FP16-INT8 - Device: CPU

OpenVINO

Model: Person Vehicle Bike Detection FP16 - Device: CPU

OpenVINO

Model: Person Vehicle Bike Detection FP16 - Device: CPU

TensorFlow Lite

Model: Inception ResNet V2

TensorFlow Lite

Model: Mobilenet Float

TensorFlow Lite

Model: Mobilenet Quant

OpenVINO

Model: Person Re-Identification Retail FP16 - Device: CPU

OpenVINO

Model: Person Re-Identification Retail FP16 - Device: CPU

OpenVINO

Model: Face Detection Retail FP16-INT8 - Device: CPU

OpenVINO

Model: Face Detection Retail FP16-INT8 - Device: CPU

OpenVINO

Model: Handwritten English Recognition FP16-INT8 - Device: CPU

OpenVINO

Model: Handwritten English Recognition FP16-INT8 - Device: CPU

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU

OpenVINO

Model: Vehicle Detection FP16-INT8 - Device: CPU

OpenVINO

Model: Vehicle Detection FP16-INT8 - Device: CPU

OpenVINO

Model: Handwritten English Recognition FP16 - Device: CPU

OpenVINO

Model: Handwritten English Recognition FP16 - Device: CPU

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU

OpenVINO

Model: Weld Porosity Detection FP16 - Device: CPU

OpenVINO

Model: Weld Porosity Detection FP16 - Device: CPU

OpenVINO

Model: Weld Porosity Detection FP16-INT8 - Device: CPU

OpenVINO

Model: Weld Porosity Detection FP16-INT8 - Device: CPU

OpenVINO

Model: Face Detection Retail FP16 - Device: CPU

OpenVINO

Model: Face Detection Retail FP16 - Device: CPU

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Parallel

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Parallel

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

TensorFlow

Device: CPU - Batch Size: 32 - Model: ResNet-50

Scikit-Learn

Benchmark: Plot OMP vs. LARS

TensorFlow

Device: GPU - Batch Size: 1 - Model: VGG-16

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-152

TensorFlow

Device: GPU - Batch Size: 1 - Model: AlexNet

TensorFlow

Device: CPU - Batch Size: 256 - Model: AlexNet

oneDNN

Harness: IP Shapes 3D - Engine: CPU

PyTorch

Device: CPU - Batch Size: 16 - Model: ResNet-50

PyTorch

Device: CPU - Batch Size: 512 - Model: ResNet-50

PyTorch

Device: CPU - Batch Size: 256 - Model: ResNet-50

PyTorch

Device: CPU - Batch Size: 32 - Model: ResNet-50

PyTorch

Device: CPU - Batch Size: 64 - Model: ResNet-50

TensorFlow

Device: GPU - Batch Size: 16 - Model: AlexNet

Scikit-Learn

Benchmark: Kernel PCA Solvers / Time vs. N Components

DeepSpeech

Acceleration: CPU

TensorFlow

Device: CPU - Batch Size: 64 - Model: GoogLeNet

Scikit-Learn

Benchmark: LocalOutlierFactor

TensorFlow

Device: CPU - Batch Size: 16 - Model: ResNet-50

oneDNN

Harness: Deconvolution Batch shapes_1d - Engine: CPU

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-50

TensorFlow

Device: GPU - Batch Size: 1 - Model: ResNet-50

R Benchmark

TensorFlow

Device: CPU - Batch Size: 32 - Model: GoogLeNet

Scikit-Learn

Benchmark: 20 Newsgroups / Logistic Regression

TensorFlow

Device: CPU - Batch Size: 64 - Model: AlexNet

TensorFlow

Device: CPU - Batch Size: 1 - Model: VGG-16

TensorFlow

Device: CPU - Batch Size: 16 - Model: GoogLeNet

TensorFlow

Device: CPU - Batch Size: 32 - Model: AlexNet

TensorFlow

Device: CPU - Batch Size: 1 - Model: ResNet-50

oneDNN

Harness: Convolution Batch Shapes Auto - Engine: CPU

RNNoise

Input: 26 Minute Long Talking Sample

TensorFlow

Device: CPU - Batch Size: 16 - Model: AlexNet

TensorFlow

Device: GPU - Batch Size: 1 - Model: GoogLeNet

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

TensorFlow

Device: CPU - Batch Size: 1 - Model: AlexNet

TensorFlow

Device: CPU - Batch Size: 1 - Model: GoogLeNet

oneDNN

Harness: Deconvolution Batch shapes_3d - Engine: CPU

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Triad

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: GEMM SGEMM_N

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Download

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Readback

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Reduction

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

Phoronix Test Suite v10.8.5