Tau T2A 8 16 32 vCPU Scaling

Benchmarks by Michael Larabel for a future article

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2208120-PTS-2208123N58
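As a shell command, that comparison run looks like the following; the test suite downloads the referenced result file and appends your own system's numbers to it:

    # Run the same test selection and merge the results into this comparison
    phoronix-test-suite benchmark 2208120-PTS-2208123N58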

Test Runs

Identifier         Date            Test Duration
Tau T2A: 8 vCPUs   August 10 2022  1 Day, 5 Hours, 56 Minutes
Tau T2A: 16 vCPUs  August 10 2022  1 Day, 1 Hour, 45 Minutes
Tau T2A: 32 vCPUs  August 11 2022  1 Day, 6 Hours, 6 Minutes


System Details

Tau T2A: 16 vCPUs
  Processor: ARMv8 Neoverse-N1 (16 Cores)
  Motherboard: KVM Google Compute Engine
  Memory: 64GB
  Disk: 215GB nvme_card-pd
  Network: Google Compute Engine Virtual
  OS: Ubuntu 22.04
  Kernel: 5.15.0-1013-gcp (aarch64)
  Compiler: GCC 12.0.1 20220319
  File-System: ext4
  System Layer: KVM

Tau T2A: 8 vCPUs
  As above, except: Processor: ARMv8 Neoverse-N1 (8 Cores); Memory: 32GB

Tau T2A: 32 vCPUs
  As above, except: Processor: ARMv8 Neoverse-N1 (32 Cores); Memory: 128GB; Kernel: 5.15.0-1016-gcp (aarch64)

Kernel Details: Transparent Huge Pages: madvise
Environment Details: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Compiler Details: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
Java Details: OpenJDK Runtime Environment (build 11.0.16+8-post-Ubuntu-0ubuntu122.04)
Python Details: Python 3.10.4
Security Details (all configurations): itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected. The 32 vCPUs run additionally reports retbleed: Not affected.


Graph500

This is a benchmark of the reference implementation of Graph500, an HPC benchmark focused on data-intensive loads and commonly tested on supercomputers for complex data problems. Graph500 primarily stresses the communication subsystem of the hardware under test. Learn more via the OpenBenchmarking.org test page.

Graph500 3.0 - Scale: 26 - bfs max_TEPS, More Is Better
  32 vCPUs: 508372000; 16 vCPUs: 262563000

Graph500 3.0 - Scale: 26 - bfs median_TEPS, More Is Better
  32 vCPUs: 477377000; 16 vCPUs: 257478000

Tau T2A: 8 vCPUs: The test quit with a non-zero exit status. E: mpirun noticed that process rank 2 with PID 0 on node instance-2 exited on signal 9 (Killed).

1. (CC) gcc options: -fcommon -O3 -march=native -lpthread -lm -lmpi

Stress-NG

Stress-NG is a Linux stress tool developed by Colin King of Canonical. Learn more via the OpenBenchmarking.org test page.
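For context, the stressors covered below map directly onto stress-ng command-line options. A minimal sketch, assuming a recent stress-ng build; the 60-second runtime is illustrative rather than the test profile's exact configuration:

    # 0 workers means one worker per online CPU; --metrics-brief prints bogo ops/s at exit
    stress-ng --futex 0   --metrics-brief --timeout 60
    stress-ng --matrix 0  --metrics-brief --timeout 60
    stress-ng --vecmath 0 --metrics-brief --timeout 60
    stress-ng --msg 0     --metrics-brief --timeout 60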

Stress-NG 0.14 - Bogo Ops/s, More Is Better
  Test: Futex - 32 vCPUs: 1437660.62 (SE +/- 15026.23, N = 3); 8 vCPUs: 937451.99 (SE +/- 36001.77, N = 15); 16 vCPUs: 1198681.87 (SE +/- 30917.20, N = 15)
  Test: CPU Cache - 32 vCPUs: 566.91 (SE +/- 0.28, N = 3); 8 vCPUs: 436.31 (SE +/- 2.30, N = 3); 16 vCPUs: 551.25 (SE +/- 2.05, N = 3)
  Test: CPU Stress - 32 vCPUs: 8209.47 (SE +/- 4.23, N = 3); 8 vCPUs: 2065.53 (SE +/- 1.28, N = 3); 16 vCPUs: 4116.96 (SE +/- 2.80, N = 3)
  Test: Matrix Math - 32 vCPUs: 151792.83 (SE +/- 9.80, N = 3); 8 vCPUs: 38215.95 (SE +/- 25.04, N = 3); 16 vCPUs: 76177.56 (SE +/- 10.44, N = 3)
  Test: Vector Math - 32 vCPUs: 97749.08 (SE +/- 190.70, N = 3); 8 vCPUs: 24633.99 (SE +/- 6.31, N = 3); 16 vCPUs: 49102.30 (SE +/- 27.43, N = 3)
  Test: System V Message Passing - 32 vCPUs: 6128517.10 (SE +/- 7551.56, N = 3); 8 vCPUs: 4507844.17 (SE +/- 12538.93, N = 3); 16 vCPUs: 5475267.36 (SE +/- 15929.45, N = 3)
  1. (CC) gcc options: -O3 -march=native -O2 -std=gnu99 -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.
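The SHA256 and RSA numbers below can be approximated with the stock openssl binary; a sketch, assuming OpenSSL 3.0 and one worker process per vCPU (the test profile's exact invocation may differ):

    # Multi-process SHA256 throughput and RSA 4096 sign/verify rates
    openssl speed -multi "$(nproc)" -evp sha256
    openssl speed -multi "$(nproc)" rsa4096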

OpenSSL 3.0 - Algorithm: SHA256 - byte/s, More Is Better
  32 vCPUs: 25788919913 (SE +/- 119493320.18, N = 3); 8 vCPUs: 6456083507 (SE +/- 19026629.44, N = 3); 16 vCPUs: 12926411527 (SE +/- 19283388.31, N = 3)
  1. (CC) gcc options: -pthread -O3 -march=native -lssl -lcrypto -ldl

Sysbench

This is a benchmark of Sysbench with the built-in CPU and memory sub-tests. Sysbench is a scriptable multi-threaded benchmark tool based on LuaJIT. Learn more via the OpenBenchmarking.org test page.
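A rough standalone equivalent of the CPU sub-test, with the thread count and duration chosen here for illustration:

    # Events per second with one thread per vCPU over 30 seconds
    sysbench cpu --threads="$(nproc)" --time=30 run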

Sysbench 1.0.20 - Test: CPU - Events Per Second, More Is Better
  32 vCPUs: 108241.61 (SE +/- 23.77, N = 3); 8 vCPUs: 27237.28 (SE +/- 6.95, N = 3); 16 vCPUs: 54317.42 (SE +/- 12.70, N = 3)
  1. (CC) gcc options: -O2 -funroll-loops -O3 -march=native -rdynamic -ldl -laio -lm

VP9 libvpx Encoding

This is a standard video encoding performance test of Google's libvpx library and the vpxenc command for the VP9 video format. Learn more via the OpenBenchmarking.org test page.
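Roughly how such an encode is launched with vpxenc; the input file name is a placeholder and the rate-control options used by the test profile are not shown:

    # VP9 encode at --cpu-used=5 ("Speed 5"), multi-threaded
    vpxenc --codec=vp9 --cpu-used=5 --threads="$(nproc)" \
        -o bosphorus_4k.webm Bosphorus_3840x2160.y4m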

VP9 libvpx Encoding 1.10.0 - Frames Per Second, More Is Better
  Speed: Speed 5 - Input: Bosphorus 4K - 32 vCPUs: 6.99 (SE +/- 0.02, N = 3); 8 vCPUs: 6.11 (SE +/- 0.01, N = 3); 16 vCPUs: 6.68 (SE +/- 0.01, N = 3)
  Speed: Speed 0 - Input: Bosphorus 1080p - 32 vCPUs: 4.99 (SE +/- 0.01, N = 3); 8 vCPUs: 4.65 (SE +/- 0.01, N = 3); 16 vCPUs: 4.84 (SE +/- 0.01, N = 3)
  Speed: Speed 5 - Input: Bosphorus 1080p - 32 vCPUs: 12.11 (SE +/- 0.01, N = 3); 8 vCPUs: 11.27 (SE +/- 0.01, N = 3); 16 vCPUs: 11.78 (SE +/- 0.02, N = 3)
  1. (CXX) g++ options: -lm -lpthread -O3 -march=native -march=armv8-a -fPIC -U_FORTIFY_SOURCE -std=gnu++11

High Performance Conjugate Gradient

HPCG is the High Performance Conjugate Gradient, a newer scientific benchmark from Sandia National Laboratories focused on supercomputer testing with modern real-world workloads compared to HPCC. Learn more via the OpenBenchmarking.org test page.
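The reference HPCG build produces an xhpcg binary that is normally launched under MPI; a sketch, where the rank count is an assumption and hpcg.dat in the working directory sets the local problem size and run time:

    # One MPI rank per vCPU; the GFLOP/s rating is written to the output report
    mpirun -np "$(nproc)" ./xhpcg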

High Performance Conjugate Gradient 3.1 - GFLOP/s, More Is Better
  32 vCPUs: 22.09 (SE +/- 0.01, N = 3); 8 vCPUs: 11.10 (SE +/- 0.00, N = 3); 16 vCPUs: 17.10 (SE +/- 0.03, N = 3)
  1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

ASKAP

ASKAP is a set of benchmarks from the Australian SKA Pathfinder. The principal ASKAP benchmarks are the Hogbom Clean Benchmark (tHogbomClean) and the Convolutional Resampling Benchmark (tConvolve), along with some earlier ASKAP benchmarks for OpenCL and CUDA execution of tConvolve. Learn more via the OpenBenchmarking.org test page.

ASKAP 1.0 - Test: Hogbom Clean OpenMP - Iterations Per Second, More Is Better
  32 vCPUs: 996.70 (SE +/- 3.30, N = 3); 8 vCPUs: 371.30 (SE +/- 1.21, N = 3); 16 vCPUs: 645.16 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

Coremark

This is a test of the EEMBC CoreMark processor benchmark. Learn more via the OpenBenchmarking.org test page.

Coremark 1.0 - CoreMark Size 666 - Iterations Per Second, More Is Better
  32 vCPUs: 700917.94 (SE +/- 385.56, N = 3); 8 vCPUs: 175037.77 (SE +/- 85.46, N = 3); 16 vCPUs: 351562.54 (SE +/- 87.09, N = 3)
  1. (CC) gcc options: -O2 -O3 -march=native -lrt" -lrt

SPECjbb 2015

This is a benchmark of SPECjbb 2015. For this test profile to work, you must have a valid license/copy of the SPECjbb 2015 ISO (SPECjbb2015-1.02.iso) in your Phoronix Test Suite download cache. Learn more via the OpenBenchmarking.org test page.

SPECjbb 2015 - jOPS, More Is Better
  SPECjbb2015-Composite max-jOPS - 32 vCPUs: 35075; 8 vCPUs: 9158; 16 vCPUs: 18092
  SPECjbb2015-Composite critical-jOPS - 32 vCPUs: 22955; 8 vCPUs: 3921; 16 vCPUs: 9207

Aircrack-ng

Aircrack-ng is a tool for assessing WiFi/WLAN network security. Learn more via the OpenBenchmarking.org test page.
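The figure reported below comes from Aircrack-ng's built-in CPU cracking-speed benchmark, which can be run standalone:

    # Built-in benchmark mode; reports keys tested per second, no capture file needed
    aircrack-ng -S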

Aircrack-ng 1.7 - k/s, More Is Better
  32 vCPUs: 33647.55 (SE +/- 287.54, N = 15); 8 vCPUs: 8308.58 (SE +/- 103.85, N = 15); 16 vCPUs: 16697.92 (SE +/- 192.97, N = 15)
  1. (CXX) g++ options: -std=gnu++17 -O3 -fvisibility=hidden -fcommon -rdynamic -lnl-3 -lnl-genl-3 -lpthread -lz -lssl -lcrypto -lhwloc -ldl -lm -pthread (some runs additionally note -lpcre)

ASKAP

ASKAP is a set of benchmarks from the Australian SKA Pathfinder. The principal ASKAP benchmarks are the Hogbom Clean Benchmark (tHogbomClean) and the Convolutional Resampling Benchmark (tConvolve), along with some earlier ASKAP benchmarks for OpenCL and CUDA execution of tConvolve. Learn more via the OpenBenchmarking.org test page.

ASKAP 1.0 - More Is Better
  Test: tConvolve MT - Gridding (Million Grid Points Per Second) - 32 vCPUs: 4456.55 (SE +/- 35.89, N = 15); 8 vCPUs: 2360.63 (SE +/- 5.73, N = 3); 16 vCPUs: 3789.01 (SE +/- 5.95, N = 3)
  Test: tConvolve MT - Degridding (Million Grid Points Per Second) - 32 vCPUs: 5522.07 (SE +/- 80.56, N = 15); 8 vCPUs: 2196.74 (SE +/- 7.87, N = 3); 16 vCPUs: 4083.16 (SE +/- 2.61, N = 3)
  Test: tConvolve OpenMP - Gridding (Million Grid Points Per Second) - 32 vCPUs: 7262.74 (SE +/- 66.63, N = 3); 8 vCPUs: 2296.10 (SE +/- 29.91, N = 3); 16 vCPUs: 3631.81 (SE +/- 43.40, N = 3)
  Test: tConvolve OpenMP - Degridding (Million Grid Points Per Second) - 32 vCPUs: 9181.24 (SE +/- 0.00, N = 3); 8 vCPUs: 2421.43 (SE +/- 33.24, N = 3); 16 vCPUs: 5023.70 (SE +/- 0.00, N = 3)
  Test: tConvolve MPI - Degridding (Mpix/sec) - 32 vCPUs: 3962.08 (SE +/- 54.84, N = 15); 8 vCPUs: 1325.06 (SE +/- 24.91, N = 15); 16 vCPUs: 2585.31 (SE +/- 12.80, N = 3)
  Test: tConvolve MPI - Gridding (Mpix/sec) - 32 vCPUs: 3899.28 (SE +/- 42.99, N = 15); 8 vCPUs: 1977.89 (SE +/- 23.42, N = 15); 16 vCPUs: 3343.25 (SE +/- 32.26, N = 3)
  1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

GROMACS

The GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package testing with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

GROMACS 2022.1 - Implementation: MPI CPU - Input: water_GMX50_bare - Ns Per Day, More Is Better
  32 vCPUs: 1.718 (SE +/- 0.010, N = 3); 8 vCPUs: 0.450 (SE +/- 0.000, N = 3); 16 vCPUs: 0.880 (SE +/- 0.001, N = 3)
  1. (CXX) g++ options: -O3 -march=native

LAMMPS Molecular Dynamics Simulator

LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. Learn more via the OpenBenchmarking.org test page.
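A sketch of a comparable standalone run, assuming an MPI-enabled lmp binary and the stock Rhodopsin input deck (in.rhodo) from the LAMMPS bench directory:

    # One MPI rank per vCPU on the Rhodopsin protein benchmark
    mpirun -np "$(nproc)" lmp -in in.rhodo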

LAMMPS Molecular Dynamics Simulator 23Jun2022 - ns/day, More Is Better
  Model: 20k Atoms - 32 vCPUs: 16.550 (SE +/- 0.004, N = 3); 8 vCPUs: 4.662 (SE +/- 0.025, N = 3); 16 vCPUs: 8.499 (SE +/- 0.112, N = 3)
  Model: Rhodopsin Protein - 32 vCPUs: 16.596 (SE +/- 0.012, N = 3); 8 vCPUs: 4.812 (SE +/- 0.011, N = 3); 16 vCPUs: 8.861 (SE +/- 0.037, N = 3)
  1. (CXX) g++ options: -O3 -march=native -ldl

Apache Cassandra

This is a benchmark of the Apache Cassandra NoSQL database management system making use of cassandra-stress. Learn more via the OpenBenchmarking.org test page.
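cassandra-stress drives the write workload against a running node; a minimal sketch, with the operation count and client thread count chosen for illustration:

    # One million inserts against a local node with a fixed client thread count
    cassandra-stress write n=1000000 -rate threads=64 -node 127.0.0.1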

Apache Cassandra 4.0 - Test: Writes - Op/s, More Is Better
  32 vCPUs: 87819 (SE +/- 777.36, N = 3); 8 vCPUs: 17862 (SE +/- 136.95, N = 10); 16 vCPUs: 39296 (SE +/- 256.55, N = 3)

Facebook RocksDB

This is a benchmark of Facebook's RocksDB as an embeddable persistent key-value store for fast storage based on Google's LevelDB. Learn more via the OpenBenchmarking.org test page.
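These workloads correspond to RocksDB's bundled db_bench tool; a sketch, with the key count and thread count as assumptions rather than the profile's exact parameters:

    # Populate a database, then measure random point reads with one thread per vCPU
    ./db_bench --benchmarks=fillseq,readrandom --num=10000000 --threads="$(nproc)"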

Facebook RocksDB 7.0.1 - Op/s, More Is Better
  Test: Random Read - 32 vCPUs: 124704201 (SE +/- 376574.31, N = 3); 8 vCPUs: 31055689 (SE +/- 252880.06, N = 3); 16 vCPUs: 62048967 (SE +/- 735054.27, N = 3)
  Test: Read While Writing - 32 vCPUs: 2610992 (SE +/- 32390.32, N = 12); 8 vCPUs: 594702 (SE +/- 9067.95, N = 15); 16 vCPUs: 1264826 (SE +/- 20446.66, N = 15)
  Test: Read Random Write Random - 32 vCPUs: 1321827 (SE +/- 9643.50, N = 15); 8 vCPUs: 548353 (SE +/- 4976.00, N = 15); 16 vCPUs: 884700 (SE +/- 1701.74, N = 3)
  1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.

OpenSSL 3.0 - Algorithm: RSA4096 - sign/s, More Is Better
  32 vCPUs: 1570.2 (SE +/- 0.06, N = 3); 8 vCPUs: 393.7 (SE +/- 0.07, N = 3); 16 vCPUs: 786.7 (SE +/- 0.06, N = 3)
  1. (CC) gcc options: -pthread -O3 -march=native -lssl -lcrypto -ldl

Graph500

This is a benchmark of the reference implementation of Graph500, an HPC benchmark focused on data-intensive loads and commonly tested on supercomputers for complex data problems. Graph500 primarily stresses the communication subsystem of the hardware under test. Learn more via the OpenBenchmarking.org test page.

Graph500 3.0 - Scale: 26 - sssp max_TEPS, More Is Better
  32 vCPUs: 169542000; 16 vCPUs: 95265500

Graph500 3.0 - Scale: 26 - sssp median_TEPS, More Is Better
  32 vCPUs: 124702000; 16 vCPUs: 70750200

Tau T2A: 8 vCPUs: The test quit with a non-zero exit status. E: mpirun noticed that process rank 2 with PID 0 on node instance-2 exited on signal 9 (Killed).

1. (CC) gcc options: -fcommon -O3 -march=native -lpthread -lm -lmpi

NAS Parallel Benchmarks

NPB, the NAS Parallel Benchmarks, is a benchmark suite developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB and offers selecting among the different NPB tests/problems and varying problem sizes. Learn more via the OpenBenchmarking.org test page.
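The MPI build of NPB produces one binary per test/class combination (e.g. bin/bt.C.x after "make bt CLASS=C"); a sketch of a single run, noting that BT and SP require a square number of MPI ranks:

    # Block Tri-diagonal solver, Class C problem size, 16 MPI ranks
    mpirun -np 16 ./bin/bt.C.x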

NAS Parallel Benchmarks 3.4 - Total Mop/s, More Is Better
  Test / Class: BT.C - 32 vCPUs: 69530.64 (SE +/- 272.46, N = 3); 8 vCPUs: 14368.29 (SE +/- 23.11, N = 3); 16 vCPUs: 49125.93 (SE +/- 18.18, N = 3)
  Test / Class: CG.C - 32 vCPUs: 21433.92 (SE +/- 35.67, N = 3); 8 vCPUs: 6855.81 (SE +/- 49.63, N = 15); 16 vCPUs: 12171.95 (SE +/- 171.15, N = 3)
  Test / Class: EP.D - 32 vCPUs: 3265.68 (SE +/- 2.04, N = 3); 8 vCPUs: 820.94 (SE +/- 0.56, N = 3); 16 vCPUs: 1634.99 (SE +/- 1.03, N = 3)
  Test / Class: FT.C - 32 vCPUs: 52309.81 (SE +/- 41.18, N = 3); 8 vCPUs: 18574.23 (SE +/- 15.96, N = 3); 16 vCPUs: 32644.85 (SE +/- 300.01, N = 3)
  Test / Class: IS.D - 32 vCPUs: 1822.77 (SE +/- 0.86, N = 3); 8 vCPUs: 1104.26 (SE +/- 1.14, N = 3); 16 vCPUs: 1498.45 (SE +/- 14.70, N = 3)
  Test / Class: LU.C - 32 vCPUs: 87702.30 (SE +/- 137.48, N = 3); 8 vCPUs: 32029.14 (SE +/- 50.76, N = 3); 16 vCPUs: 55447.31 (SE +/- 701.24, N = 3)
  Test / Class: MG.C - 32 vCPUs: 50939.05 (SE +/- 31.40, N = 3); 8 vCPUs: 27703.33 (SE +/- 46.23, N = 3); 16 vCPUs: 33309.76 (SE +/- 102.49, N = 3)
  Test / Class: SP.B - 32 vCPUs: 34381.91 (SE +/- 38.20, N = 3); 8 vCPUs: 7338.98 (SE +/- 17.11, N = 3); 16 vCPUs: 19552.45 (SE +/- 244.58, N = 3)
  Test / Class: SP.C - 32 vCPUs: 26843.58 (SE +/- 31.60, N = 3); 8 vCPUs: 7115.28 (SE +/- 39.91, N = 3); 16 vCPUs: 19710.90 (SE +/- 112.56, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

PostgreSQL pgbench

This is a benchmark of PostgreSQL using pgbench for facilitating the database benchmarks. Learn more via the OpenBenchmarking.org test page.
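The read-only results map onto pgbench's select-only mode; a sketch against a local database named pgbench, where the scale factor and client count mirror the 100/100 configuration and the duration is illustrative:

    createdb pgbench
    pgbench -i -s 100 pgbench               # initialize at scaling factor 100
    pgbench -S -c 100 -j 16 -T 60 pgbench   # select-only, 100 clients, 60 seconds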

PostgreSQL pgbench 14.0 - TPS, More Is Better
  Scaling Factor: 100 - Clients: 100 - Mode: Read Only - 32 vCPUs: 329539 (SE +/- 1811.74, N = 3); 8 vCPUs: 54237 (SE +/- 663.61, N = 3); 16 vCPUs: 157894 (SE +/- 697.61, N = 3)
  Scaling Factor: 100 - Clients: 250 - Mode: Read Only - 32 vCPUs: 312239 (SE +/- 4561.68, N = 12); 8 vCPUs: 49628 (SE +/- 588.20, N = 12); 16 vCPUs: 131607 (SE +/- 1418.81, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lm

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.

OpenSSL 3.0 - Algorithm: RSA4096 - verify/s, More Is Better
  32 vCPUs: 128273.1 (SE +/- 29.86, N = 3); 8 vCPUs: 32136.8 (SE +/- 10.26, N = 3); 16 vCPUs: 64247.0 (SE +/- 8.35, N = 3)
  1. (CC) gcc options: -pthread -O3 -march=native -lssl -lcrypto -ldl

TensorFlow Lite

This is a benchmark of the TensorFlow Lite implementation focused on TensorFlow machine learning for mobile, IoT, edge, and other cases. The current Linux support is limited to running on CPUs. This test profile is measuring the average inference time. Learn more via the OpenBenchmarking.org test page.
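The averages below are the kind of figure produced by TensorFlow Lite's benchmark_model tool; a sketch, where the model path and thread count are assumptions:

    # Average CPU inference time over repeated runs
    ./benchmark_model --graph=squeezenet.tflite --num_threads="$(nproc)"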

TensorFlow Lite 2022-05-18 - Microseconds, Fewer Is Better
  Model: SqueezeNet - 32 vCPUs: 3853.90 (SE +/- 31.57, N = 8); 8 vCPUs: 6618.32 (SE +/- 9.96, N = 3); 16 vCPUs: 3955.89 (SE +/- 11.05, N = 3)
  Model: Inception V4 - 32 vCPUs: 31657.3 (SE +/- 149.01, N = 3); 8 vCPUs: 97646.1 (SE +/- 49.87, N = 3); 16 vCPUs: 46113.4 (SE +/- 84.24, N = 3)
  Model: Mobilenet Float - 32 vCPUs: 2093.25 (SE +/- 17.55, N = 3); 8 vCPUs: 4395.73 (SE +/- 3.06, N = 3); 16 vCPUs: 2481.96 (SE +/- 4.32, N = 3)
  Model: Inception ResNet V2 - 32 vCPUs: 33994.9 (SE +/- 379.42, N = 3); 8 vCPUs: 91592.6 (SE +/- 70.02, N = 3); 16 vCPUs: 45445.9 (SE +/- 16.18, N = 3)

Renaissance

Renaissance is a suite of benchmarks designed to test the Java JVM from Apache Spark to a Twitter-like service to Scala and other features. Learn more via the OpenBenchmarking.org test page.

Renaissance 0.14 - ms, Fewer Is Better
  Test: Apache Spark Bayes - 32 vCPUs: 766.4 (SE +/- 9.73, N = 3; MIN: 495.95 / MAX: 1178.88); 8 vCPUs: 2249.8 (SE +/- 42.77, N = 15; MIN: 1478.18 / MAX: 2434.18); 16 vCPUs: 1262.0 (SE +/- 7.45, N = 3; MIN: 877.37 / MAX: 1398.23)
  Test: Savina Reactors.IO - 32 vCPUs: 10705.9 (SE +/- 131.70, N = 4; MIN: 10505.49 / MAX: 14847.21); 8 vCPUs: 26456.4 (SE +/- 435.60, N = 9; MIN: 13667.16 / MAX: 42318.14); 16 vCPUs: 15981.5 (SE +/- 583.25, N = 12; MIN: 12776.53 / MAX: 36273.51)

PostgreSQL pgbench

This is a benchmark of PostgreSQL using pgbench for facilitating the database benchmarks. Learn more via the OpenBenchmarking.org test page.

PostgreSQL pgbench 14.0 - ms, Fewer Is Better
  Scaling Factor: 100 - Clients: 100 - Mode: Read Only - Average Latency - 32 vCPUs: 0.304 (SE +/- 0.002, N = 3); 8 vCPUs: 1.844 (SE +/- 0.023, N = 3); 16 vCPUs: 0.633 (SE +/- 0.003, N = 3)
  Scaling Factor: 100 - Clients: 250 - Mode: Read Only - Average Latency - 32 vCPUs: 0.803 (SE +/- 0.012, N = 12); 8 vCPUs: 5.045 (SE +/- 0.060, N = 12); 16 vCPUs: 1.900 (SE +/- 0.021, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lm

TNN

TNN is an open-source deep learning reasoning framework developed by Tencent. Learn more via the OpenBenchmarking.org test page.

TNN 0.3 - ms, Fewer Is Better
  Target: CPU - Model: DenseNet - 32 vCPUs: 3056.90 (SE +/- 6.90, N = 3; MIN: 2928.19 / MAX: 3237.58); 8 vCPUs: 3842.12 (SE +/- 9.75, N = 3; MIN: 3619.38 / MAX: 4060.16); 16 vCPUs: 3358.73 (SE +/- 12.43, N = 3; MIN: 3163.2 / MAX: 3575.85)
  Target: CPU - Model: MobileNet v2 - 32 vCPUs: 322.77 (SE +/- 0.05, N = 3; MIN: 319.63 / MAX: 326.43); 8 vCPUs: 331.34 (SE +/- 0.78, N = 3; MIN: 327.36 / MAX: 339.94); 16 vCPUs: 328.89 (SE +/- 1.36, N = 3; MIN: 322.15 / MAX: 373.8)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

DaCapo Benchmark

This test runs the DaCapo Benchmarks written in Java and intended to test system/CPU performance. Learn more via the OpenBenchmarking.org test page.
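The DaCapo suite ships as a single jar with each workload selected by name; a sketch for the Tradesoap result shown below:

    # Reports the wall-clock time (msec) of the tradesoap workload
    java -jar dacapo-9.12-MR1-bach.jar tradesoap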

DaCapo Benchmark 9.12-MR1 - Java Test: Tradesoap - msec, Fewer Is Better
  32 vCPUs: 5015 (SE +/- 95.95, N = 20); 8 vCPUs: 8040 (SE +/- 68.63, N = 4); 16 vCPUs: 5598 (SE +/- 52.52, N = 4)

OpenFOAM

OpenFOAM is the leading free, open-source software for computational fluid dynamics (CFD). This test profile currently uses the drivaerFastback test case for analyzing automotive aerodynamics or alternatively the older motorBike input. Learn more via the OpenBenchmarking.org test page.

OpenFOAM 9 - Input: drivaerFastback, Medium Mesh Size - Seconds, Fewer Is Better
  Mesh Time - 32 vCPUs: 206.40; 8 vCPUs: 425.95; 16 vCPUs: 303.71
  Execution Time - 32 vCPUs: 994.53; 8 vCPUs: 2426.16; 16 vCPUs: 1534.72
  1. (CXX) g++ options: -std=c++14 -O3 -mcpu=native -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lgenericPatchFields -lOpenFOAM -ldl -lm (per-run link flags additionally include -lfoamToVTK -ldynamicMesh -llagrangian -lfileFormats and -ltransportModels -lspecie -lfiniteVolume -lfvModels -lmeshTools -lsampling)

libavif avifenc

This is a test of the AOMedia libavif library testing the encoding of a JPEG image to AV1 Image Format (AVIF). Learn more via the OpenBenchmarking.org test page.
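avifenc exposes the encoder speed and lossless settings tested here directly on the command line; a sketch with a placeholder input image:

    # Lower speed values trade encode time for compression efficiency
    avifenc -s 6 input.png output.avif
    # Lossless variant at the same speed setting
    avifenc -s 6 --lossless input.png output-lossless.avif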

libavif avifenc 0.10 - Seconds, Fewer Is Better
  Encoder Speed: 0 - 32 vCPUs: 266.34 (SE +/- 0.65, N = 3); 8 vCPUs: 456.24 (SE +/- 0.90, N = 3); 16 vCPUs: 328.97 (SE +/- 0.80, N = 3)
  Encoder Speed: 2 - 32 vCPUs: 169.64 (SE +/- 0.13, N = 3); 8 vCPUs: 245.60 (SE +/- 0.32, N = 3); 16 vCPUs: 194.77 (SE +/- 0.47, N = 3)
  Encoder Speed: 6 - 32 vCPUs: 6.682 (SE +/- 0.020, N = 3); 8 vCPUs: 20.273 (SE +/- 0.130, N = 3); 16 vCPUs: 11.168 (SE +/- 0.037, N = 3)
  Encoder Speed: 6, Lossless - 32 vCPUs: 10.34 (SE +/- 0.00, N = 3); 8 vCPUs: 23.31 (SE +/- 0.11, N = 3); 16 vCPUs: 14.70 (SE +/- 0.19, N = 3)
  Encoder Speed: 10, Lossless - 32 vCPUs: 6.775 (SE +/- 0.072, N = 3); 8 vCPUs: 9.997 (SE +/- 0.096, N = 3); 16 vCPUs: 7.658 (SE +/- 0.065, N = 8)
  1. (CXX) g++ options: -O3 -fPIC -march=native -lm

Timed FFmpeg Compilation

This test times how long it takes to build the FFmpeg multimedia library. Learn more via the OpenBenchmarking.org test page.
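The compile-time tests amount to timed parallel builds; a sketch of the FFmpeg case, with configure options omitted (the test profile's flags may differ):

    ./configure              # default options
    time make -j"$(nproc)"   # the wall-clock build time is the reported result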

Timed FFmpeg Compilation 4.4 - Time To Compile - Seconds, Fewer Is Better
  32 vCPUs: 38.96 (SE +/- 0.16, N = 3); 8 vCPUs: 113.46 (SE +/- 0.18, N = 3); 16 vCPUs: 61.98 (SE +/- 0.09, N = 3)

Timed Gem5 Compilation

This test times how long it takes to compile Gem5. Gem5 is a simulator for computer system architecture research. Gem5 is widely used for computer architecture research within the industry, academia, and more. Learn more via the OpenBenchmarking.org test page.

Timed Gem5 Compilation 21.2 - Time To Compile - Seconds, Fewer Is Better
  32 vCPUs: 312.12 (SE +/- 1.96, N = 3); 8 vCPUs: 917.39 (SE +/- 0.66, N = 3); 16 vCPUs: 495.54 (SE +/- 1.16, N = 3)

Timed MPlayer Compilation

This test times how long it takes to build the MPlayer open-source media player program. Learn more via the OpenBenchmarking.org test page.

Timed MPlayer Compilation 1.5 - Time To Compile - Seconds, Fewer Is Better
  32 vCPUs: 28.93 (SE +/- 0.35, N = 4); 8 vCPUs: 88.84 (SE +/- 0.03, N = 3); 16 vCPUs: 47.62 (SE +/- 0.26, N = 3)

Apache Spark

This is a benchmark of Apache Spark with its PySpark interface. Apache Spark is an open-source unified analytics engine for large-scale data processing and dealing with big data. This test profile benchmarks Apache Spark in a single-system configuration using spark-submit. The test makes use of DIYBigData's pyspark-benchmark (https://github.com/DIYBigData/pyspark-benchmark/) for generating test data and various Apache Spark operations. Learn more via the OpenBenchmarking.org test page.

Apache Spark 3.3 - Seconds, Fewer Is Better
  Row Count: 1000000 - Partitions: 100 - SHA-512 Benchmark Time - 32 vCPUs: 4.79 (SE +/- 0.11, N = 15); 8 vCPUs: 6.28 (SE +/- 0.03, N = 3); 16 vCPUs: 4.93 (SE +/- 0.05, N = 12)
  Row Count: 1000000 - Partitions: 100 - Calculate Pi Benchmark - 32 vCPUs: 69.77 (SE +/- 0.06, N = 15); 8 vCPUs: 277.89 (SE +/- 0.17, N = 3); 16 vCPUs: 137.76 (SE +/- 0.11, N = 12)
  Row Count: 1000000 - Partitions: 100 - Calculate Pi Benchmark Using Dataframe - 32 vCPUs: 4.79 (SE +/- 0.01, N = 15); 8 vCPUs: 15.92 (SE +/- 0.02, N = 3); 16 vCPUs: 8.39 (SE +/- 0.02, N = 12)
  Row Count: 1000000 - Partitions: 100 - Repartition Test Time - 32 vCPUs: 2.01 (SE +/- 0.03, N = 15); 8 vCPUs: 4.58 (SE +/- 0.02, N = 3); 16 vCPUs: 2.55 (SE +/- 0.03, N = 12)
  Row Count: 1000000 - Partitions: 100 - Inner Join Test Time - 32 vCPUs: 2.13 (SE +/- 0.02, N = 15); 8 vCPUs: 3.48 (SE +/- 0.04, N = 3); 16 vCPUs: 2.22 (SE +/- 0.03, N = 12)
  Row Count: 1000000 - Partitions: 100 - Broadcast Inner Join Test Time - 32 vCPUs: 1.68 (SE +/- 0.03, N = 15); 8 vCPUs: 2.86 (SE +/- 0.03, N = 3); 16 vCPUs: 1.79 (SE +/- 0.04, N = 12)
  Row Count: 1000000 - Partitions: 2000 - SHA-512 Benchmark Time - 32 vCPUs: 4.96 (SE +/- 0.04, N = 15); 8 vCPUs: 8.09 (SE +/- 0.08, N = 3); 16 vCPUs: 5.91 (SE +/- 0.05, N = 3)
  Row Count: 1000000 - Partitions: 2000 - Calculate Pi Benchmark - 32 vCPUs: 69.92 (SE +/- 0.06, N = 15); 8 vCPUs: 278.47 (SE +/- 0.35, N = 3); 16 vCPUs: 137.17 (SE +/- 0.20, N = 3)
  Row Count: 1000000 - Partitions: 2000 - Calculate Pi Benchmark Using Dataframe - 32 vCPUs: 4.80 (SE +/- 0.01, N = 15); 8 vCPUs: 15.81 (SE +/- 0.07, N = 3); 16 vCPUs: 8.29 (SE +/- 0.03, N = 3)
  Row Count: 1000000 - Partitions: 2000 - Group By Test Time - 32 vCPUs: 6.72 (SE +/- 0.05, N = 15); 8 vCPUs: 8.85 (SE +/- 0.13, N = 3); 16 vCPUs: 7.43 (SE +/- 0.10, N = 3)
  Row Count: 1000000 - Partitions: 2000 - Repartition Test Time - 32 vCPUs: 2.60 (SE +/- 0.03, N = 15); 8 vCPUs: 5.67 (SE +/- 0.01, N = 3); 16 vCPUs: 3.36 (SE +/- 0.01, N = 3)
  Row Count: 1000000 - Partitions: 2000 - Inner Join Test Time - 32 vCPUs: 2.87 (SE +/- 0.04, N = 15); 8 vCPUs: 5.98 (SE +/- 0.09, N = 3); 16 vCPUs: 3.65 (SE +/- 0.11, N = 3)
  Row Count: 1000000 - Partitions: 2000 - Broadcast Inner Join Test Time - 32 vCPUs: 2.12 (SE +/- 0.02, N = 15); 8 vCPUs: 5.18 (SE +/- 0.07, N = 3); 16 vCPUs: 2.65 (SE +/- 0.05, N = 3)
  Row Count: 40000000 - Partitions: 100 - SHA-512 Benchmark Time - 32 vCPUs: 46.30 (SE +/- 0.45, N = 9); 8 vCPUs: 93.67 (SE +/- 0.85, N = 3); 16 vCPUs: 51.40 (SE +/- 0.22, N = 3)
  Row Count: 40000000 - Partitions: 100 - Calculate Pi Benchmark - 32 vCPUs: 69.57 (SE +/- 0.08, N = 9); 8 vCPUs: 278.37 (SE +/- 0.14, N = 3); 16 vCPUs: 136.85 (SE +/- 0.06, N = 3)
  Row Count: 40000000 - Partitions: 100 - Calculate Pi Benchmark Using Dataframe - 32 vCPUs: 4.76 (SE +/- 0.01, N = 9); 8 vCPUs: 15.65 (SE +/- 0.05, N = 3); 16 vCPUs: 8.40 (SE +/- 0.01, N = 3)
  Row Count: 40000000 - Partitions: 100 - Group By Test Time - 32 vCPUs: 27.64 (SE +/- 0.16, N = 9); 8 vCPUs: 50.87 (SE +/- 0.53, N = 3); 16 vCPUs: 35.64 (SE +/- 0.24, N = 3)
  Row Count: 40000000 - Partitions: 100 - Repartition Test Time - 32 vCPUs: 24.36 (SE +/- 0.12, N = 9); 8 vCPUs: 68.68 (SE +/- 0.25, N = 3); 16 vCPUs: 37.22 (SE +/- 0.92, N = 3)
  Row Count: 40000000 - Partitions: 100 - Inner Join Test Time - 32 vCPUs: 30.32 (SE +/- 0.44, N = 9); 8 vCPUs: 80.02 (SE +/- 0.19, N = 3); 16 vCPUs: 44.62 (SE +/- 0.72, N = 3)
  Row Count: 40000000 - Partitions: 100 - Broadcast Inner Join Test Time - 32 vCPUs: 31.98 (SE +/- 0.26, N = 9); 8 vCPUs: 80.26 (SE +/- 0.22, N = 3); 16 vCPUs: 45.29 (SE +/- 0.44, N = 3)
  Row Count: 40000000 - Partitions: 2000 - SHA-512 Benchmark Time - 32 vCPUs: 39.22 (SE +/- 0.55, N = 12); 8 vCPUs: 89.23 (SE +/- 0.52, N = 3); 16 vCPUs: 51.55 (SE +/- 0.08, N = 3)
  Row Count: 40000000 - Partitions: 2000 - Calculate Pi Benchmark - 32 vCPUs: 69.79 (SE +/- 0.11, N = 12); 8 vCPUs: 277.83 (SE +/- 0.13, N = 3); 16 vCPUs: 137.04 (SE +/- 0.17, N = 3)
  Row Count: 40000000 - Partitions: 2000 - Calculate Pi Benchmark Using Dataframe - 32 vCPUs: 4.78 (SE +/- 0.02, N = 12); 8 vCPUs: 15.71 (SE +/- 0.00, N = 3); 16 vCPUs: 8.34 (SE +/- 0.03, N = 3)
  Row Count: 40000000 - Partitions: 2000 - Group By Test Time - 32 vCPUs: 22.84 (SE +/- 0.32, N = 12); 8 vCPUs: 45.61 (SE +/- 0.57, N = 3); 16 vCPUs: 30.70 (SE +/- 0.23, N = 3)
  Row Count: 40000000 - Partitions: 2000 - Repartition Test Time - 32 vCPUs: 22.22 (SE +/- 0.24, N = 12); 8 vCPUs: 66.27 (SE +/- 0.36, N = 3); 16 vCPUs: 35.45 (SE +/- 0.20, N = 3)
  Row Count: 40000000 - Partitions: 2000 - Inner Join Test Time - 32 vCPUs: 28.66 (SE +/- 0.19, N = 12); 8 vCPUs: 78.00 (SE +/- 1.11, N = 3); 16 vCPUs: 44.39 (SE +/- 0.73, N = 3)
  Row Count: 40000000 - Partitions: 2000 - Broadcast Inner Join Test Time - 32 vCPUs: 26.55 (SE +/- 0.17, N = 12); 8 vCPUs: 74.71 (SE +/- 0.33, N = 3); 16 vCPUs: 42.75 (SE +/- 0.40, N = 3)

ASTC Encoder

ASTC Encoder (astcenc) is for the Adaptive Scalable Texture Compression (ASTC) format commonly used with OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile does a coding test of both compression/decompression. Learn more via the OpenBenchmarking.org test page.
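astcenc selects these presets by name on the command line; a sketch where the input image, block size, and thread count are assumptions (the installed binary may be named for its SIMD variant, e.g. astcenc-neon on these instances):

    # Compress to 6x6 ASTC blocks at the -medium preset using all vCPUs
    astcenc -cl input.png output.astc 6x6 -medium -j "$(nproc)"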

ASTC Encoder 3.2 - Seconds, Fewer Is Better
  Preset: Medium - 32 vCPUs: 5.9825 (SE +/- 0.0035, N = 3); 8 vCPUs: 9.0505 (SE +/- 0.0253, N = 3); 16 vCPUs: 6.9449 (SE +/- 0.0194, N = 3)
  Preset: Thorough - 32 vCPUs: 7.1619 (SE +/- 0.0033, N = 3); 8 vCPUs: 29.0505 (SE +/- 0.0316, N = 3); 16 vCPUs: 14.2146 (SE +/- 0.0106, N = 3)
  Preset: Exhaustive - 32 vCPUs: 68.66 (SE +/- 0.08, N = 3); 8 vCPUs: 276.88 (SE +/- 3.08, N = 3); 16 vCPUs: 137.62 (SE +/- 0.04, N = 3)
  1. (CXX) g++ options: -O3 -march=native -flto -pthread

GPAW

GPAW is a density-functional theory (DFT) Python code based on the projector-augmented wave (PAW) method and the atomic simulation environment (ASE). Learn more via the OpenBenchmarking.org test page.

GPAW 22.1 - Input: Carbon Nanotube - Seconds, Fewer Is Better
  32 vCPUs: 130.35 (SE +/- 0.30, N = 3); 8 vCPUs: 381.20 (SE +/- 0.63, N = 3); 16 vCPUs: 208.97 (SE +/- 0.03, N = 3)
  1. (CC) gcc options: -shared -fwrapv -O2 -O3 -march=native -lxc -lblas -lmpi

Blender

Blender is an open-source 3D creation software project. This test is of Blender's Cycles benchmark with various sample files. GPU computing is supported. This system/blender test profile makes use of the system-supplied Blender. Use pts/blender if wishing to stick to a fixed version of Blender. Learn more via the OpenBenchmarking.org test page.
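Blender's benchmark scenes render headless from the command line; a sketch for the BMW27 case, with the .blend file path as a placeholder:

    # Background render of frame 1; the total render time is the reported result
    blender -b bmw27.blend -f 1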

Blender - Seconds, Fewer Is Better
  Blend File: BMW27 - Compute: CPU-Only - 32 vCPUs: 112.47 (SE +/- 0.10, N = 3); 8 vCPUs: 447.71 (SE +/- 0.04, N = 3); 16 vCPUs: 226.26 (SE +/- 0.50, N = 3)
  Blend File: Classroom - Compute: CPU-Only - 32 vCPUs: 249.89 (SE +/- 0.07, N = 3); 8 vCPUs: 1016.66 (SE +/- 1.99, N = 3); 16 vCPUs: 506.10 (SE +/- 0.22, N = 3)
  Blend File: Fishy Cat - Compute: CPU-Only - 32 vCPUs: 214.41 (SE +/- 0.42, N = 3); 8 vCPUs: 841.18 (SE +/- 1.74, N = 3); 16 vCPUs: 426.04 (SE +/- 0.85, N = 3)

102 Results Shown

Graph500:
  26:
    bfs max_TEPS
    bfs median_TEPS
Stress-NG:
  Futex
  CPU Cache
  CPU Stress
  Matrix Math
  Vector Math
  System V Message Passing
OpenSSL
Sysbench
VP9 libvpx Encoding:
  Speed 5 - Bosphorus 4K
  Speed 0 - Bosphorus 1080p
  Speed 5 - Bosphorus 1080p
High Performance Conjugate Gradient
ASKAP
Coremark
SPECjbb 2015:
  SPECjbb2015-Composite max-jOPS
  SPECjbb2015-Composite critical-jOPS
Aircrack-ng
ASKAP:
  tConvolve MT - Gridding
  tConvolve MT - Degridding
  tConvolve OpenMP - Gridding
  tConvolve OpenMP - Degridding
  tConvolve MPI - Degridding
  tConvolve MPI - Gridding
GROMACS
LAMMPS Molecular Dynamics Simulator:
  20k Atoms
  Rhodopsin Protein
Apache Cassandra
Facebook RocksDB:
  Rand Read
  Read While Writing
  Read Rand Write Rand
OpenSSL
Graph500:
  26:
    sssp max_TEPS
    sssp median_TEPS
NAS Parallel Benchmarks:
  BT.C
  CG.C
  EP.D
  FT.C
  IS.D
  LU.C
  MG.C
  SP.B
  SP.C
PostgreSQL pgbench:
  100 - 100 - Read Only
  100 - 250 - Read Only
OpenSSL
TensorFlow Lite:
  SqueezeNet
  Inception V4
  Mobilenet Float
  Inception ResNet V2
Renaissance:
  Apache Spark Bayes
  Savina Reactors.IO
PostgreSQL pgbench:
  100 - 100 - Read Only - Average Latency
  100 - 250 - Read Only - Average Latency
TNN:
  CPU - DenseNet
  CPU - MobileNet v2
DaCapo Benchmark
OpenFOAM:
  drivaerFastback, Medium Mesh Size - Mesh Time
  drivaerFastback, Medium Mesh Size - Execution Time
libavif avifenc:
  0
  2
  6
  6, Lossless
  10, Lossless
Timed FFmpeg Compilation
Timed Gem5 Compilation
Timed MPlayer Compilation
Apache Spark:
  1000000 - 100 - SHA-512 Benchmark Time
  1000000 - 100 - Calculate Pi Benchmark
  1000000 - 100 - Calculate Pi Benchmark Using Dataframe
  1000000 - 100 - Repartition Test Time
  1000000 - 100 - Inner Join Test Time
  1000000 - 100 - Broadcast Inner Join Test Time
  1000000 - 2000 - SHA-512 Benchmark Time
  1000000 - 2000 - Calculate Pi Benchmark
  1000000 - 2000 - Calculate Pi Benchmark Using Dataframe
  1000000 - 2000 - Group By Test Time
  1000000 - 2000 - Repartition Test Time
  1000000 - 2000 - Inner Join Test Time
  1000000 - 2000 - Broadcast Inner Join Test Time
  40000000 - 100 - SHA-512 Benchmark Time
  40000000 - 100 - Calculate Pi Benchmark
  40000000 - 100 - Calculate Pi Benchmark Using Dataframe
  40000000 - 100 - Group By Test Time
  40000000 - 100 - Repartition Test Time
  40000000 - 100 - Inner Join Test Time
  40000000 - 100 - Broadcast Inner Join Test Time
  40000000 - 2000 - SHA-512 Benchmark Time
  40000000 - 2000 - Calculate Pi Benchmark
  40000000 - 2000 - Calculate Pi Benchmark Using Dataframe
  40000000 - 2000 - Group By Test Time
  40000000 - 2000 - Repartition Test Time
  40000000 - 2000 - Inner Join Test Time
  40000000 - 2000 - Broadcast Inner Join Test Time
ASTC Encoder:
  Medium
  Thorough
  Exhaustive
GPAW
Blender:
  BMW27 - CPU-Only
  Classroom - CPU-Only
  Fishy Cat - CPU-Only