ACES DGEMM

This is a multi-threaded DGEMM benchmark.
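
For readers unfamiliar with the operation, DGEMM is the double-precision general matrix multiply, C = alpha * A * B + beta * C, and the reported figure is a floating-point rate in GFLOP/s. The following is a minimal single-threaded sketch in Python, not the benchmark's own code. The function names and the operation-counting convention (2n^3 + 2n^2 per multiply) are assumptions made for illustration.

```python
import time


def dgemm(alpha, a, b, beta, c):
    """Naive in-place C = alpha * A @ B + beta * C for square lists of lists."""
    n = len(a)
    for i in range(n):
        row_a = a[i]
        row_c = c[i]
        for j in range(n):
            acc = 0.0
            for k in range(n):
                acc += row_a[k] * b[k][j]
            row_c[j] = alpha * acc + beta * row_c[j]
    return c


def dgemm_flops(n, repeats=1):
    # Assumed convention: 2*n^3 for the multiply-adds in A @ B,
    # plus 2*n^2 for the alpha and beta scaling of each element.
    return repeats * (2.0 * n**3 + 2.0 * n**2)


def measure_gflops(n=64, repeats=2):
    """Time `repeats` multiplies of n x n matrices; return (GFLOP/s, C)."""
    a = [[2.0] * n for _ in range(n)]
    b = [[0.5] * n for _ in range(n)]
    c = [[1.0] * n for _ in range(n)]
    start = time.perf_counter()
    for _ in range(repeats):
        dgemm(1.0, a, b, 1.0, c)
    # Guard against a zero reading on coarse timers.
    elapsed = max(time.perf_counter() - start, 1e-12)
    return dgemm_flops(n, repeats) / elapsed / 1e9, c
```

Pure Python runs orders of magnitude slower than a compiled, threaded kernel, so the rate this sketch reports is not comparable to the results on this page. It only shows how a DGEMM rate is derived from an operation count and a wall-clock time.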

To run this test with the Phoronix Test Suite, the basic command is: phoronix-test-suite benchmark mt-dgemm.

Project Site

lanl.gov

Test Created

11 October 2019

Test Maintainer

Michael Larabel 

Test Type

Processor

Average Install Time

2 Seconds

Average Run Time

15 Minutes, 11 Seconds

Test Dependencies

C/C++ Compiler Toolchain

Accolades

5k+ Downloads

Supported Platforms


[Chart: ACES DGEMM Popularity Statistics (pts/mt-dgemm), OpenBenchmarking.org. Events per month from 2019.10 to 2021.07. Series: Public Result Uploads *, Reported Installs **, Reported Test Completions **, Test Profile Page Views ***.]
* Uploading of benchmark result data to OpenBenchmarking.org is always optional (opt-in) via the Phoronix Test Suite for users wishing to share their results publicly.
** Data based on those opting to upload their test results to OpenBenchmarking.org and users enabling the opt-in anonymous statistics reporting while running benchmarks from an Internet-connected platform.
*** Test profile page view reporting began March 2021.
Data current as of 31 July 2021.

Revision History

pts/mt-dgemm-1.2.0   Fri, 11 Oct 2019 15:29:15 GMT
Initial commit of ACES DGEMM

Suites Using This Test

Multi-Core

Scientific Computing

HPC - High Performance Computing

Linear Algebra

Programmer / Developer System Benchmarks


Performance Metrics

Analyze Test Configuration:

ACES DGEMM 1.0

Sustained Floating-Point Rate

OpenBenchmarking.org metrics for this test profile configuration based on 1,419 public results since 11 October 2019 with the latest data as of 30 July 2021.

Below is an overview of the generalized performance for components with enough statistically significant data from user-uploaded results. Keep in mind that OS configurations can differ vastly, particularly in the Linux/open-source space, so this overview offers only general guidance on performance expectations.

Component   Percentile Rank   # Compatible Public Results   GFLOP/s (Average)
            100th             20                            38.3 +/- 0.5
            100th             12                            31.8 +/- 0.3
            97th              27                            27.4 +/- 3.2
            96th              10                            23.2 +/- 0.4
            96th              3                             22.5 +/- 1.1
            96th              11                            21.5 +/- 0.1
            94th              5                             20.5 +/- 0.3
            92nd              4                             20.1 +/- 1.7
            92nd              11                            20.1 +/- 1.0
            92nd              11                            19.8 +/- 0.7
            90th              5                             18.8 +/- 0.1
            89th              6                             17.5 +/- 1.0
            87th              12                            16.7 +/- 0.2
            87th              8                             16.3 +/- 0.3
            86th              3                             16.2 +/- 0.4
            86th              3                             15.8 +/- 0.3
            85th              12                            15.5 +/- 0.3
            85th              7                             15.1 +/- 0.2
            85th              4                             15.1 +/- 1.7
            82nd              33                            13.9 +/- 0.2
            81st              7                             13.6 +/- 0.4
            80th              7                             13.4 +/- 0.4
            80th              4                             13.3 +/- 1.6
            79th              11                            12.7 +/- 0.1
            78th              6                             12.6 +/- 0.5
            77th              10                            12.2 +/- 0.1
            77th              6                             12.1 +/- 0.6
            76th              6                             11.7 +/- 0.8
Mid-Tier    75th                                            < 11.3
            75th              7                             11.3 +/- 0.3
            75th              10                            11.3 +/- 0.3
            75th              4                             11.2 +/- 0.2
            74th              15                            10.9 +/- 1.6
            73rd              5                             10.7 +/- 0.4
            72nd              9                             10.6 +/- 1.1
            72nd              7                             10.6 +/- 0.1
            72nd              3                             10.6 +/- 0.2
            71st              6                             10.2 +/- 0.6
            70th              11                            9.8 +/- 0.1
            66th              8                             9.4 +/- 0.4
            64th              14                            9.2 +/- 0.3
            63rd              9                             9.1 +/- 0.2
            61st              7                             8.9 +/- 0.1
            61st              42                            8.8 +/- 0.8
            60th              7                             8.3 +/- 0.2
            59th              3                             7.9 +/- 0.1
            57th              8                             7.6 +/- 0.2
            55th              17                            7.3 +/- 0.3
            54th              3                             7.2 +/- 0.2
            53rd              9                             6.8 +/- 0.3
            52nd              3                             6.6 +/- 0.2
            52nd              3                             6.5 +/- 0.2
Median      50th                                            6.3
            50th              24                            6.3 +/- 0.2
            49th              4                             6.2 +/- 0.2
            48th              7                             5.9 +/- 0.2
            48th              11                            5.9 +/- 0.5
            46th              6                             5.7 +/- 0.4
            46th              3                             5.7 +/- 0.1
            46th              15                            5.5 +/- 0.1
            44th              10                            5.4 +/- 0.6
            43rd              13                            5.2 +/- 0.2
            42nd              3                             5.0 +/- 0.1
            41st              13                            4.9 +/- 0.3
            41st              16                            4.7 +/- 0.5
            38th              6                             4.4 +/- 0.3
            38th              6                             4.4 +/- 0.1
            36th              5                             4.2 +/- 0.2
            34th              29                            4.1 +/- 0.5
            33rd              10                            3.8 +/- 0.3
            32nd              5                             3.6 +/- 0.4
            31st              9                             3.5 +/- 0.1
            30th              9                             3.3 +/- 0.2
Low-Tier    25th                                            < 2.7
            24th              10                            2.6 +/- 0.2
            24th              4                             2.5 +/- 0.1
            22nd              10                            2.3 +/- 0.1
            22nd              6                             2.3 +/- 0.1
            21st              5                             2.1 +/- 0.1
            19th              10                            1.8 +/- 0.1
            18th              3                             1.7 +/- 0.1
            17th              18                            1.6 +/- 0.1
            14th              7                             1.3 +/- 0.2
[Chart: Distribution Of Public Results, Sustained Floating-Point Rate, OpenBenchmarking.org. 1419 results range from 0 to 40 GFLOP/s.]
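
The Mid-Tier, Median, and Low-Tier markers in the table above can be used to place a new result relative to the public data. The sketch below hard-codes those three thresholds from the table. The tier labels it returns, the function names, and the percentile-rank definition (share of samples strictly below a value) are assumptions, since the page does not state how OpenBenchmarking.org computes its ranks.

```python
# Thresholds in GFLOP/s, copied from the marker rows of the table.
MID_TIER_CEILING = 11.3  # 75th percentile
MEDIAN = 6.3             # 50th percentile
LOW_TIER_CEILING = 2.7   # 25th percentile


def classify(gflops):
    """Return an illustrative tier label for a GFLOP/s result."""
    if gflops < LOW_TIER_CEILING:
        return "low-tier"
    if gflops < MID_TIER_CEILING:
        return "mid-tier"
    return "above mid-tier"


def percentile_rank(value, samples):
    """Percentage of samples strictly below value (one common definition)."""
    below = sum(1 for s in samples if s < value)
    return 100.0 * below / len(samples)
```

For example, `classify(6.3)` places the median result in the mid tier, and `classify(1.3)` places the slowest listed component in the low tier.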

Based on OpenBenchmarking.org data, the selected test / test configuration (ACES DGEMM 1.0 - Sustained Floating-Point Rate) has an average run-time of 7 minutes. By default this test profile is set to run at least 3 times but may increase if the standard deviation exceeds pre-defined defaults or other calculations deem additional runs necessary for greater statistical accuracy of the result.
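
The run-count policy described above (a minimum number of runs, extended while the deviation between runs is too high) can be sketched as follows. This is an illustration, not the Phoronix Test Suite's implementation: the function names, the `max_runs` cap, and the 3.5% threshold are hypothetical values chosen for the example.

```python
import statistics


def relative_stddev_percent(samples):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * statistics.stdev(samples) / statistics.mean(samples)


def run_until_stable(run_once, min_runs=3, max_runs=15,
                     max_deviation_percent=3.5):
    """Call run_once() at least min_runs times, then keep adding runs
    while the relative deviation exceeds the (hypothetical) threshold."""
    results = [run_once() for _ in range(min_runs)]
    while (len(results) < max_runs
           and relative_stddev_percent(results) > max_deviation_percent):
        results.append(run_once())
    return results
```

A steady workload finishes after the minimum three runs, while a noisy one keeps running until the deviation settles or the cap is reached.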

[Chart: Time Required To Complete Benchmark, Sustained Floating-Point Rate run-time in minutes, OpenBenchmarking.org. Min: 1 / Avg: 6.31 / Max: 90.]

Based on public OpenBenchmarking.org results, the selected test / test configuration has an average standard deviation of 1.3%.

[Chart: Average Deviation Between Runs, Sustained Floating-Point Rate, in percent (fewer is better), OpenBenchmarking.org. Min: 0 / Avg: 1.3 / Max: 8.]

Does It Scale Well With Increasing Cores?

Yes. Based on the automated analysis of the collected public benchmark data, this test generally scales well with increasing CPU core counts at these settings. The analysis uses publicly available results for this test and these settings, separated by vendor. Each result is divided by the reference CPU clock speed, grouped by matching physical CPU core count, and normalized against the smallest core count tested from each vendor. Only CPUs with a sufficient number of test samples and statistically significant data are included.
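
The normalization described above can be sketched in a few lines. This is an assumed reading of the description, not the OpenBenchmarking.org analytics code: the function name and the tuple layout are hypothetical, results within a group are combined with a plain mean, and the filtering for sample count and statistical significance is omitted.

```python
from collections import defaultdict
from statistics import mean


def core_scaling(results):
    """results: iterable of (vendor, physical_cores, clock_ghz, gflops).
    Returns {vendor: {physical_cores: scaling relative to the smallest
    core count seen for that vendor}}."""
    # Step 1: divide each result by the reference clock speed and
    # group by vendor and physical core count.
    per_group = defaultdict(list)
    for vendor, cores, clock_ghz, gflops in results:
        per_group[(vendor, cores)].append(gflops / clock_ghz)

    # Step 2: average the per-clock results within each group.
    averaged = defaultdict(dict)
    for (vendor, cores), values in per_group.items():
        averaged[vendor][cores] = mean(values)

    # Step 3: normalize against the smallest core count per vendor.
    scaling = {}
    for vendor, by_cores in averaged.items():
        base = by_cores[min(by_cores)]
        scaling[vendor] = {cores: value / base
                           for cores, value in sorted(by_cores.items())}
    return scaling
```

With this reading, a value of 2.0 at 8 cores against a 4-core base means the per-clock rate doubled when the core count doubled, which is what ideal scaling would look like.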

[Chart: ACES DGEMM CPU Core Scaling, Sustained Floating-Point Rate, OpenBenchmarking.org. Relative core scaling to base for Intel and AMD, by physical core count from 4 to 128.]

Notable Instruction Set Usage

Notable instruction set extensions supported by this test, based on an automatic analysis by the Phoronix Test Suite / OpenBenchmarking.org analytics engine.

Instruction Set: AVX
Support: Used by default on supported hardware. Found on Intel processors since Sandy Bridge (2011). Found on AMD processors since Bulldozer (2011).
Instructions Detected: VZEROUPPER, VEXTRACTF128, VINSERTF128

Instruction Set: FMA (FMA)
Support: Used by default on supported hardware. Found on Intel processors since Haswell (2013). Found on AMD processors since Bulldozer (2011).
Instructions Detected: VFMADD132SD, VFMADD231SD

The test / benchmark does honor compiler flag changes.
Last automated analysis: 10 May 2021

This test profile binary relies on the shared libraries libgomp.so.1, libc.so.6, libdl.so.2, libpthread.so.0.

Recent Test Results

1 System - 1 Benchmark Result

Intel Pentium N3710 - AMI Aptio CRB - Intel Atom

CentOS Linux 7 - 3.10.0-1160.el7.x86_64 - GNOME Shell 3.28.3

1 System - 5 Benchmark Results

Intel Pentium N3710 - AMI Aptio CRB - Intel Atom

CentOS Linux 7 - 3.10.0-1160.el7.x86_64 - GNOME Shell 3.28.3

1 System - 6 Benchmark Results

Intel Pentium N3710 - AMI Aptio CRB - Intel Atom

CentOS Linux 7 - 3.10.0-1160.el7.x86_64 - GNOME Shell 3.28.3

1 System - 7 Benchmark Results

Intel Pentium N3710 - AMI Aptio CRB - Intel Atom

CentOS Linux 7 - 3.10.0-1160.el7.x86_64 - GNOME Shell 3.28.3

1 System - 5 Benchmark Results

Intel Pentium N3710 - AMI Aptio CRB - Intel Atom

CentOS Linux 7 - 3.10.0-1160.el7.x86_64 - GNOME Shell 3.28.3

1 System - 7 Benchmark Results

Intel Pentium N3710 - AMI Aptio CRB - Intel Atom

CentOS Linux 7 - 3.10.0-1160.el7.x86_64 - GNOME Shell 3.28.3

3 Systems - 240 Benchmark Results

2 x AMD EPYC 7543 32-Core - HPE ProLiant DL385 Gen10 Plus v2 - AMD Starship

RedHatEnterpriseServer 7.9 - 3.10.0-1160.31.1.el7.x86_64 - GNOME Shell 3.28.3

3 Systems - 240 Benchmark Results

AMD EPYC 72F3 8-Core - Supermicro H12SSL-i v1.01 - AMD Starship

Ubuntu 21.04 - 5.11.0-16-generic - GNOME Shell 3.38.4

1 System - 162 Benchmark Results

2 x AMD EPYC 7763 64-Core - AMD DAYTONA_X - AMD Starship

Ubuntu 21.04 - 5.14.0-rc1-vanilla - X Server 1.20.11

1 System - 2 Benchmark Results

AMD EPYC 7313 16-Core - Supermicro H12SSL-i v1.01 - AMD Starship

Debian 10 - 4.19.0-17-amd64 - GCC 8.3.0 + Open64 PARSE ERROR

3 Systems - 272 Benchmark Results

4 x Intel Xeon Gold 6254 - HPE ProLiant DL560 Gen10 - Intel Sky Lake-E DMI3 Registers

RedHatEnterpriseServer 7.9 - 3.10.0-1160.31.1.el7.x86_64 - GNOME Shell 3.28.3

1 System - 4 Benchmark Results

AMD EPYC 7313 16-Core - Supermicro H12SSL-i v1.01 - AMD Starship

Debian 10 - 4.19.0-17-amd64 - GCC 8.3.0 + Open64 PARSE ERROR

Most Popular Test Results

3 Systems - 268 Benchmark Results

Intel Core i5-2520M - HP 161C - Intel 2nd Generation Core DRAM

Ubuntu 18.04 - 4.18.0-20-generic - GNOME Shell 3.28.3

11 Systems - 217 Benchmark Results

AMD Ryzen 5 2600 Six-Core - ASUS ROG CROSSHAIR VIII HERO - AMD 17h

Ubuntu 20.04 - 5.9.0-050900-generic - GNOME Shell 3.36.4

12 Systems - 593 Benchmark Results

AMD Ryzen 7 3700X 8-Core - MSI MEG X570 GODLIKE - AMD Starship

Ubuntu 20.04 - 5.8.0-050800daily20200622-generic - GNOME Shell 3.36.2

8 Systems - 360 Benchmark Results

AMD Ryzen Threadripper 2990WX 32-Core - ASUS ROG ZENITH EXTREME - AMD 17h

Ubuntu 19.10 - 5.4.0-999-generic - GNOME Shell 3.34.1

15 Systems - 38 Benchmark Results

2 x Intel Xeon Gold 6258R - Supermicro X11DAi-N v1.10 - Intel Sky Lake-E DMI3 Registers

Fedora 32 - 5.6.14-300.fc32.x86_64 - GNOME Shell 3.36.2

3 Systems - 301 Benchmark Results

Intel Core i5-4670 - MSI B85M-P33 - Intel 4th Gen Core DRAM

Ubuntu 20.04 - 5.4.0-40-generic - GNOME Shell 3.36.3

7 Systems - 62 Benchmark Results

Intel Core i9-7960X - 15360MB - 2 x 275GB Virtual Disk

Ubuntu 18.04 - 4.19.75-microsoft-standard - GCC 7.4.0

12 Systems - 229 Benchmark Results

AMD Ryzen 5 2600X Six-Core - ASUS ROG CROSSHAIR VIII HERO - AMD 17h

Ubuntu 20.04 - 5.9.0-050900-generic - GNOME Shell 3.36.4

2 Systems - 59 Benchmark Results

AMD Ryzen 7 3700X 8-Core - MSI MEG X570 GODLIKE - AMD Device 1480

Clear Linux OS 31480 - 5.3.8-854.native - GNOME Shell 3.34.1

18 Systems - 115 Benchmark Results

Intel Core i9-9900KS - ASUS PRIME Z390-A - Intel Cannon Lake PCH

Ubuntu 19.10 - 5.3.0-19-generic - GNOME Shell 3.34.1
