CloudSuite Data Analytics

CloudSuite Data Analytics is a Docker-based benchmark that runs a Naive Bayes classifier on a Wikimedia dataset using Hadoop and Mahout.

To run this test with the Phoronix Test Suite, the basic command is: phoronix-test-suite benchmark cloudsuite-da.

Project Site

github.com

Source Repository

github.com

Test Created

1 November 2019

Last Updated

4 February 2023

Test Maintainer

Michael Larabel 

Test Type

System

Average Install Time

39 Seconds

Average Run Time

26 Minutes, 48 Seconds

Accolades

20k+ Downloads

Supported Platforms


[Chart: CloudSuite Data Analytics Popularity Statistics (OpenBenchmarking.org), pts/cloudsuite-da, events per period from 2019.11 to 2024.03. Series: Public Result Uploads *, Reported Installs **, Reported Test Completions **, Test Profile Page Views ***.]
* Uploading of benchmark result data to OpenBenchmarking.org is always optional (opt-in) via the Phoronix Test Suite for users wishing to share their results publicly.
** Data based on those opting to upload their test results to OpenBenchmarking.org and users enabling the opt-in anonymous statistics reporting while running benchmarks from an Internet-connected platform.
*** Test profile page view reporting began March 2021.
Data updated weekly as of 13 April 2024.
[Chart: Hadoop Slaves Option Popularity (OpenBenchmarking.org). 512: 7.8%, 256: 7.8%, 32: 15.6%, 4: 16.5%, 64: 13.8%, 128: 10.2%, 1: 12.0%, 8: 16.5%.]

Revision History

pts/cloudsuite-da-1.2.0   [View Source]   Sat, 04 Feb 2023 07:58:29 GMT
Update against CloudSuite 4.0 upstream state.

pts/cloudsuite-da-1.1.0   [View Source]   Thu, 03 Nov 2022 17:52:22 GMT
Update Docker locations, allow Hadoop slave count configuration. This test though doesn't seem too useful/good...

pts/cloudsuite-da-1.0.0   [View Source]   Fri, 01 Nov 2019 17:55:39 GMT
Initial commit of CloudSuite Data Analytics benchmark.


Performance Metrics

Analyze Test Configuration:

CloudSuite Data Analytics 4.0

Hadoop Slaves: 4

OpenBenchmarking.org metrics for this test profile configuration based on 55 public results since 4 February 2023 with the latest data as of 26 April 2023.

Below is an overview of the generalized performance for components where there is sufficient statistically significant data based upon user-uploaded results. Keep in mind that, particularly in the Linux/open-source space, OS configurations can differ vastly, so this overview is intended to offer only general guidance on performance expectations.

Component    Percentile Rank    # Compatible Public Results    ms (Average)
             79th               4                              283449 +/- 4723
Mid-Tier     75th                                              > 285805
             66th               3                              901784 +/- 1686
             57th               3                              915989 +/- 2771
Median       50th                                              943077
             50th               3                              943978 +/- 784
             26th               3                              1307377 +/- 4471
Low-Tier     25th                                              > 1308725
             11th               4                              1431513 +/- 141431
             8th                3                              1490669 +/- 2931
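As a rough illustration of how a percentile rank can be read for a lower-is-better metric such as milliseconds, the sketch below counts the share of results that are slower than a given value. This is an assumption about the general idea, not OpenBenchmarking.org's actual ranking method; the function names are hypothetical, and the sample list simply reuses the seven component averages from the table above rather than the full set of 55 public results, so its output will not reproduce the published ranks.

```python
from typing import Sequence


def percentile_rank_lower_is_better(value_ms: float, results_ms: Sequence[float]) -> float:
    """Percent of results that are slower (higher ms) than value_ms.

    Hypothetical helper: a simple rank definition chosen for illustration.
    """
    if not results_ms:
        raise ValueError("results_ms must not be empty")
    slower = sum(1 for r in results_ms if r > value_ms)
    return 100.0 * slower / len(results_ms)


def ms_to_minutes(ms: float) -> float:
    """Convert a millisecond result to minutes."""
    return ms / 60000.0


# Illustrative sample only: the component averages listed in the table.
sample_ms = [283449, 901784, 915989, 943978, 1307377, 1431513, 1490669]

if __name__ == "__main__":
    median_ms = 943077  # median value reported on the page
    print(percentile_rank_lower_is_better(median_ms, sample_ms))
    print(round(ms_to_minutes(median_ms), 2))
```

Converting the reported median of 943077 ms gives roughly 15.7 minutes.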
[Chart: Distribution Of Public Results - Hadoop Slaves: 4 (OpenBenchmarking.org). 55 Results Range From 18119 To 1553995 ms.]

Based on OpenBenchmarking.org data, the selected test / test configuration (CloudSuite Data Analytics 4.0 - Hadoop Slaves: 4) has an average run-time of 16 minutes. By default, this test profile is set to run at least 1 time, but the run count may increase if the standard deviation exceeds pre-defined defaults or other calculations deem additional runs necessary for greater statistical accuracy of the result.
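The idea of adding runs when results vary too much can be sketched as follows. This is a minimal illustration under stated assumptions, not the Phoronix Test Suite's implementation: the function names, the 5% relative standard deviation threshold, and the cap of 5 runs are all hypothetical values chosen for the example.

```python
import statistics
from typing import Callable, List, Sequence


def relative_stddev(results: Sequence[float]) -> float:
    """Sample standard deviation divided by the mean.

    Returns 0.0 for fewer than two samples, since a single run has no
    measurable spread.
    """
    if len(results) < 2:
        return 0.0
    mean = statistics.mean(results)
    return statistics.stdev(results) / mean if mean else 0.0


def run_until_stable(run_once: Callable[[], float],
                     min_runs: int = 1,
                     max_runs: int = 5,            # hypothetical cap
                     max_rel_stddev: float = 0.05  # hypothetical threshold
                     ) -> List[float]:
    """Run at least min_runs times, then keep adding runs while the
    relative standard deviation exceeds the threshold, up to max_runs."""
    results = [run_once() for _ in range(min_runs)]
    while len(results) < max_runs and relative_stddev(results) > max_rel_stddev:
        results.append(run_once())
    return results


if __name__ == "__main__":
    # Stand-in for a real benchmark run: replays fixed, made-up timings.
    timings = iter([100.0, 120.0, 101.0, 100.0, 99.0])
    print(run_until_stable(lambda: next(timings), min_runs=3))
```

With `min_runs=1`, as in this test profile's default, a single run is accepted as-is in this sketch because no spread can be computed from one sample; the usage example sets `min_runs=3` only to show the extension behavior.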

[Chart: Time Required To Complete Benchmark (OpenBenchmarking.org), Hadoop Slaves: 4, Run-Time in Minutes. Min: 1 / Avg: 15.75 / Max: 27.]

Tested CPU Architectures

This benchmark has been successfully tested on the architectures listed below. The CPU architectures listed are those for which successful OpenBenchmarking.org result uploads occurred, which helps determine whether a given test is compatible with various alternative CPU architectures.

CPU Architecture          Kernel Identifier    Verified On
Intel / AMD x86 64-bit    x86_64               (Many Processors)