Tau T2A 8 16 32 vCPU Scaling

Benchmarks by Michael Larabel for a future article

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2208120-PTS-2208123N58

Result Identifier     Date             Test Duration
Tau T2A: 8 vCPUs      August 10 2022   1 Day, 5 Hours, 56 Minutes
Tau T2A: 16 vCPUs     August 10 2022   1 Day, 1 Hour, 45 Minutes
Tau T2A: 32 vCPUs     August 11 2022   1 Day, 6 Hours, 6 Minutes
Average                                1 Day, 4 Hours, 35 Minutes



System Details (Tau T2A: 8 vCPUs / 16 vCPUs / 32 vCPUs)

Processor: ARMv8 Neoverse-N1 (8 Cores) / ARMv8 Neoverse-N1 (16 Cores) / ARMv8 Neoverse-N1 (32 Cores)
Motherboard: KVM Google Compute Engine
Memory: 32GB / 64GB / 128GB
Disk: 215GB nvme_card-pd
Network: Google Compute Engine Virtual
OS: Ubuntu 22.04
Kernel: 5.15.0-1013-gcp (aarch64) / 5.15.0-1013-gcp (aarch64) / 5.15.0-1016-gcp (aarch64)
Compiler: GCC 12.0.1 20220319
File-System: ext4
System Layer: KVM

Kernel Details: Transparent Huge Pages: madvise

Environment Details: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"

Compiler Details: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v

Java Details: OpenJDK Runtime Environment (build 11.0.16+8-post-Ubuntu-0ubuntu122.04)

Python Details: Python 3.10.4

Security Details:
Tau T2A: 8 vCPUs: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
Tau T2A: 16 vCPUs: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
Tau T2A: 32 vCPUs: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected

Result Overview (tests compared across Tau T2A 8 vCPUs, 16 vCPUs, and 32 vCPUs)

stress-ng: Futex; CPU Cache; CPU Stress; Matrix Math; Vector Math; System V Message Passing
spec-jbb2015: SPECjbb2015-Composite max-jOPS; SPECjbb2015-Composite critical-jOPS
dacapobench: Tradesoap
renaissance: Apache Spark Bayes; Savina Reactors.IO
astcenc: Medium; Thorough; Exhaustive
graph500: 26 (four results: bfs median_TEPS, bfs max_TEPS, sssp median_TEPS, sssp max_TEPS)
tensorflow-lite: SqueezeNet; Inception V4; Mobilenet Float; Inception ResNet V2
tnn: CPU - DenseNet; CPU - MobileNet v2
gromacs: MPI CPU - water_GMX50_bare
lammps: 20k Atoms; Rhodopsin Protein
hpcg
npb: BT.C; CG.C; EP.D; FT.C; IS.D; LU.C; MG.C; SP.B; SP.C
askap: tConvolve MT - Gridding; tConvolve MT - Degridding; tConvolve MPI - Degridding; tConvolve MPI - Gridding; tConvolve OpenMP - Gridding; tConvolve OpenMP - Degridding; Hogbom Clean OpenMP
openfoam: drivaerFastback, Medium Mesh Size - Mesh Time; drivaerFastback, Medium Mesh Size - Execution Time
gpaw: Carbon Nanotube
coremark: CoreMark Size 666 - Iterations Per Second
aircrack-ng
build-ffmpeg: Time To Compile
build-mplayer: Time To Compile
sysbench: CPU
vpxenc: Speed 5 - Bosphorus 4K; Speed 0 - Bosphorus 1080p; Speed 5 - Bosphorus 1080p
avifenc: 0; 2; 6; 6, Lossless; 10, Lossless
build-gem5: Time To Compile
openssl: SHA256; RSA4096 (sign/s); RSA4096 (verify/s)
spark: 1000000 - 100: SHA-512 Benchmark Time; Calculate Pi Benchmark; Calculate Pi Benchmark Using Dataframe; Repartition Test Time; Inner Join Test Time; Broadcast Inner Join Test Time
spark: 1000000 - 2000: SHA-512 Benchmark Time; Calculate Pi Benchmark; Calculate Pi Benchmark Using Dataframe; Group By Test Time; Repartition Test Time; Inner Join Test Time; Broadcast Inner Join Test Time
spark: 40000000 - 100: SHA-512 Benchmark Time; Calculate Pi Benchmark; Calculate Pi Benchmark Using Dataframe; Group By Test Time; Repartition Test Time; Inner Join Test Time; Broadcast Inner Join Test Time
spark: 40000000 - 2000: SHA-512 Benchmark Time; Calculate Pi Benchmark; Calculate Pi Benchmark Using Dataframe; Group By Test Time; Repartition Test Time; Inner Join Test Time; Broadcast Inner Join Test Time
rocksdb: Rand Read; Read While Writing; Read Rand Write Rand
cassandra: Writes
pgbench: 100 - 100 - Read Only; 100 - 100 - Read Only - Average Latency; 100 - 250 - Read Only; 100 - 250 - Read Only - Average Latency
blender: Fishy Cat - CPU-Only; Classroom - CPU-Only; BMW27 - CPU-Only

Stress-NG

Stress-NG is a Linux stress tool developed by Colin King of Canonical. Learn more via the OpenBenchmarking.org test page.

Stress-NG 0.14, Test: Futex (Bogo Ops/s, More Is Better)
8 vCPUs: 937451.99 (SE +/- 36001.77, N = 15)
16 vCPUs: 1198681.87 (SE +/- 30917.20, N = 15)
32 vCPUs: 1437660.62 (SE +/- 15026.23, N = 3)

Stress-NG 0.14, Test: CPU Cache (Bogo Ops/s, More Is Better)
8 vCPUs: 436.31 (SE +/- 2.30, N = 3)
16 vCPUs: 551.25 (SE +/- 2.05, N = 3)
32 vCPUs: 566.91 (SE +/- 0.28, N = 3)

Stress-NG 0.14, Test: CPU Stress (Bogo Ops/s, More Is Better)
8 vCPUs: 2065.53 (SE +/- 1.28, N = 3)
16 vCPUs: 4116.96 (SE +/- 2.80, N = 3)
32 vCPUs: 8209.47 (SE +/- 4.23, N = 3)

Stress-NG 0.14, Test: Matrix Math (Bogo Ops/s, More Is Better)
8 vCPUs: 38215.95 (SE +/- 25.04, N = 3)
16 vCPUs: 76177.56 (SE +/- 10.44, N = 3)
32 vCPUs: 151792.83 (SE +/- 9.80, N = 3)

Stress-NG 0.14, Test: Vector Math (Bogo Ops/s, More Is Better)
8 vCPUs: 24633.99 (SE +/- 6.31, N = 3)
16 vCPUs: 49102.30 (SE +/- 27.43, N = 3)
32 vCPUs: 97749.08 (SE +/- 190.70, N = 3)

Stress-NG 0.14, Test: System V Message Passing (Bogo Ops/s, More Is Better)
8 vCPUs: 4507844.17 (SE +/- 12538.93, N = 3)
16 vCPUs: 5475267.36 (SE +/- 15929.45, N = 3)
32 vCPUs: 6128517.10 (SE +/- 7551.56, N = 3)

All Stress-NG tests: (CC) gcc options: -O3 -march=native -O2 -std=gnu99 -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
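
The scaling in the Stress-NG results above can be made explicit by dividing each result by the 8 vCPU result (speedup) and then by the vCPU ratio (parallel efficiency). The following is a minimal Python sketch using three of the reported tests; the function name `scaling_efficiency` and the dictionary layout are illustrative choices, not part of the Phoronix Test Suite.

```python
# Throughput results (Bogo Ops/s, more is better) copied from the Stress-NG results above.
RESULTS = {
    "Futex": {8: 937451.99, 16: 1198681.87, 32: 1437660.62},
    "CPU Stress": {8: 2065.53, 16: 4116.96, 32: 8209.47},
    "Matrix Math": {8: 38215.95, 16: 76177.56, 32: 151792.83},
}


def scaling_efficiency(results, baseline=8):
    """Return {vcpus: (speedup, efficiency)} relative to the baseline vCPU count."""
    base = results[baseline]
    out = {}
    for vcpus, value in sorted(results.items()):
        speedup = value / base                      # more-is-better metric
        efficiency = speedup / (vcpus / baseline)   # 1.0 means perfectly linear scaling
        out[vcpus] = (speedup, efficiency)
    return out


if __name__ == "__main__":
    for test, results in RESULTS.items():
        for vcpus, (speedup, eff) in scaling_efficiency(results).items():
            print(f"{test:12s} {vcpus:2d} vCPUs: {speedup:5.2f}x speedup, {eff:6.1%} efficiency")
```

On these numbers CPU Stress and Matrix Math stay close to linear, while Futex gains only about 1.5x going from 8 to 32 vCPUs.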

SPECjbb 2015

This is a benchmark of SPECjbb 2015. For this test profile to work, you must have a valid license/copy of the SPECjbb 2015 ISO (SPECjbb2015-1.02.iso) in your Phoronix Test Suite download cache. Learn more via the OpenBenchmarking.org test page.

SPECjbb 2015, SPECjbb2015-Composite max-jOPS (jOPS, More Is Better)
8 vCPUs: 9158
16 vCPUs: 18092
32 vCPUs: 35075

SPECjbb 2015, SPECjbb2015-Composite critical-jOPS (jOPS, More Is Better)
8 vCPUs: 3921
16 vCPUs: 9207
32 vCPUs: 22955

DaCapo Benchmark

This test runs the DaCapo Benchmarks written in Java and intended to test system/CPU performance. Learn more via the OpenBenchmarking.org test page.

DaCapo Benchmark 9.12-MR1, Java Test: Tradesoap (msec, Fewer Is Better)
8 vCPUs: 8040 (SE +/- 68.63, N = 4)
16 vCPUs: 5598 (SE +/- 52.52, N = 4)
32 vCPUs: 5015 (SE +/- 95.95, N = 20)
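
Each result on this page is reported with a standard error (SE) and a run count (N). Assuming the SE shown is the standard error of the mean, that is the sample standard deviation divided by sqrt(N) (the page does not state this explicitly), the spread of the individual runs can be recovered as sketched below. The helper names are hypothetical.

```python
import math


def stddev_from_se(se, n):
    """Recover the sample standard deviation, assuming SE = stddev / sqrt(N)."""
    return se * math.sqrt(n)


def relative_se(mean, se):
    """Standard error expressed as a fraction of the mean."""
    return se / mean


# DaCapo Tradesoap results from above: vCPUs -> (mean msec, SE, N).
TRADESOAP = {
    8: (8040, 68.63, 4),
    16: (5598, 52.52, 4),
    32: (5015, 95.95, 20),
}

if __name__ == "__main__":
    for vcpus, (mean, se, n) in TRADESOAP.items():
        sd = stddev_from_se(se, n)
        print(f"{vcpus:2d} vCPUs: mean {mean} ms, SE {se} (N = {n}), "
              f"implied stddev {sd:.1f} ms, relative SE {relative_se(mean, se):.2%}")
```

Under that assumption the 32 vCPU run has the widest implied run-to-run spread of the three.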

Renaissance

Renaissance is a suite of benchmarks designed to test the JVM, with workloads ranging from Apache Spark to a Twitter-like service to Scala and other features. Learn more via the OpenBenchmarking.org test page.

Renaissance 0.14, Test: Apache Spark Bayes (ms, Fewer Is Better)
8 vCPUs: 2249.8 (SE +/- 42.77, N = 15; MIN: 1478.18 / MAX: 2434.18)
16 vCPUs: 1262.0 (SE +/- 7.45, N = 3; MIN: 877.37 / MAX: 1398.23)
32 vCPUs: 766.4 (SE +/- 9.73, N = 3; MIN: 495.95 / MAX: 1178.88)

Renaissance 0.14, Test: Savina Reactors.IO (ms, Fewer Is Better)
8 vCPUs: 26456.4 (SE +/- 435.60, N = 9; MIN: 13667.16 / MAX: 42318.14)
16 vCPUs: 15981.5 (SE +/- 583.25, N = 12; MIN: 12776.53 / MAX: 36273.51)
32 vCPUs: 10705.9 (SE +/- 131.70, N = 4; MIN: 10505.49 / MAX: 14847.21)

ASTC Encoder

ASTC Encoder (astcenc) is for the Adaptive Scalable Texture Compression (ASTC) format commonly used with OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile does a coding test of both compression/decompression. Learn more via the OpenBenchmarking.org test page.

ASTC Encoder 3.2, Preset: Medium (Seconds, Fewer Is Better)
8 vCPUs: 9.0505 (SE +/- 0.0253, N = 3)
16 vCPUs: 6.9449 (SE +/- 0.0194, N = 3)
32 vCPUs: 5.9825 (SE +/- 0.0035, N = 3)

ASTC Encoder 3.2, Preset: Thorough (Seconds, Fewer Is Better)
8 vCPUs: 29.0505 (SE +/- 0.0316, N = 3)
16 vCPUs: 14.2146 (SE +/- 0.0106, N = 3)
32 vCPUs: 7.1619 (SE +/- 0.0033, N = 3)

ASTC Encoder 3.2, Preset: Exhaustive (Seconds, Fewer Is Better)
8 vCPUs: 276.88 (SE +/- 3.08, N = 3)
16 vCPUs: 137.62 (SE +/- 0.04, N = 3)
32 vCPUs: 68.66 (SE +/- 0.08, N = 3)

All ASTC Encoder tests: (CXX) g++ options: -O3 -march=native -flto -pthread

Graph500

This is a benchmark of the reference implementation of Graph500, an HPC benchmark focused on data intensive loads and commonly tested on supercomputers for complex data problems. Graph500 primarily stresses the communication subsystem of the hardware under test. Learn more via the OpenBenchmarking.org test page.

Graph500 3.0, Scale: 26 (bfs median_TEPS, More Is Better)
16 vCPUs: 257478000
32 vCPUs: 477377000

Graph500 3.0, Scale: 26 (bfs max_TEPS, More Is Better)
16 vCPUs: 262563000
32 vCPUs: 508372000

Graph500 3.0, Scale: 26 (sssp median_TEPS, More Is Better)
16 vCPUs: 70750200
32 vCPUs: 124702000

Graph500 3.0, Scale: 26 (sssp max_TEPS, More Is Better)
16 vCPUs: 95265500
32 vCPUs: 169542000

All Graph500 tests: (CC) gcc options: -fcommon -O3 -march=native -lpthread -lm -lmpi

Scale: 26 (all four results)

Tau T2A: 8 vCPUs: The test quit with a non-zero exit status. E: mpirun noticed that process rank 2 with PID 0 on node instance-2 exited on signal 9 (Killed).
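
The Scale 26 failure on the 8 vCPU instance (32GB of memory) is reported only as a process killed by signal 9; the result page does not give the cause. One plausible explanation, offered here as an assumption rather than a finding, is memory exhaustion. The Graph500 specification defines 2^SCALE vertices with an edge factor of 16, and the sketch below estimates the size of the raw edge list assuming 16 bytes per edge (two 64-bit vertex IDs). The constant and function names are illustrative, and the memory sizes are treated as GiB for the comparison.

```python
EDGE_FACTOR = 16      # edges per vertex, per the Graph500 specification
BYTES_PER_EDGE = 16   # assumption: two 64-bit vertex IDs per edge


def edge_list_bytes(scale, edge_factor=EDGE_FACTOR, bytes_per_edge=BYTES_PER_EDGE):
    """Estimate the raw edge-list size in bytes for a given Graph500 scale."""
    vertices = 2 ** scale
    edges = vertices * edge_factor
    return edges * bytes_per_edge


if __name__ == "__main__":
    size = edge_list_bytes(26)
    print(f"Scale 26 edge list: {size / 2**30:.0f} GiB ({size / 1e9:.1f} GB)")
    # Instance memory sizes from the system details (assumed to be GiB here).
    for mem in (32, 64, 128):
        print(f"  {mem}GB instance: edge list is {size / (mem * 2**30):.0%} of memory")
```

Under these assumptions the raw edge list alone takes about half of the 8 vCPU instance's memory before the benchmark builds its graph structures, which would be consistent with, but does not prove, an out-of-memory kill.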

TensorFlow Lite

This is a benchmark of the TensorFlow Lite implementation focused on TensorFlow machine learning for mobile, IoT, edge, and other cases. The current Linux support is limited to running on CPUs. This test profile is measuring the average inference time. Learn more via the OpenBenchmarking.org test page.

TensorFlow Lite 2022-05-18, Model: SqueezeNet (Microseconds, Fewer Is Better)
8 vCPUs: 6618.32 (SE +/- 9.96, N = 3)
16 vCPUs: 3955.89 (SE +/- 11.05, N = 3)
32 vCPUs: 3853.90 (SE +/- 31.57, N = 8)

TensorFlow Lite 2022-05-18, Model: Inception V4 (Microseconds, Fewer Is Better)
8 vCPUs: 97646.1 (SE +/- 49.87, N = 3)
16 vCPUs: 46113.4 (SE +/- 84.24, N = 3)
32 vCPUs: 31657.3 (SE +/- 149.01, N = 3)

TensorFlow Lite 2022-05-18, Model: Mobilenet Float (Microseconds, Fewer Is Better)
8 vCPUs: 4395.73 (SE +/- 3.06, N = 3)
16 vCPUs: 2481.96 (SE +/- 4.32, N = 3)
32 vCPUs: 2093.25 (SE +/- 17.55, N = 3)

TensorFlow Lite 2022-05-18, Model: Inception ResNet V2 (Microseconds, Fewer Is Better)
8 vCPUs: 91592.6 (SE +/- 70.02, N = 3)
16 vCPUs: 45445.9 (SE +/- 16.18, N = 3)
32 vCPUs: 33994.9 (SE +/- 379.42, N = 3)
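
Because the TensorFlow Lite results are average inference times (fewer is better), a speedup is the baseline time divided by the new time rather than the other way round, and a throughput figure is the reciprocal of the latency. The following Python sketch illustrates both conversions on the values reported above; the function names are hypothetical, and the throughput figure assumes inferences run back to back at the average time.

```python
# Average inference time in microseconds, from the TensorFlow Lite results above.
TFLITE_US = {
    "SqueezeNet": {8: 6618.32, 16: 3955.89, 32: 3853.90},
    "Inception V4": {8: 97646.1, 16: 46113.4, 32: 31657.3},
    "Mobilenet Float": {8: 4395.73, 16: 2481.96, 32: 2093.25},
    "Inception ResNet V2": {8: 91592.6, 16: 45445.9, 32: 33994.9},
}


def inferences_per_second(microseconds):
    """Convert an average inference time in microseconds to inferences per second."""
    return 1_000_000.0 / microseconds


def speedup_lower_is_better(baseline_time, time):
    """Speedup for a time-based metric: values above 1.0 mean the new run is faster."""
    return baseline_time / time


if __name__ == "__main__":
    for model, times in TFLITE_US.items():
        for vcpus in (16, 32):
            s = speedup_lower_is_better(times[8], times[vcpus])
            rate = inferences_per_second(times[vcpus])
            print(f"{model:20s} {vcpus:2d} vCPUs: {s:4.2f}x vs 8 vCPUs, {rate:7.1f} inferences/s")
```

For example, SqueezeNet improves by under 3% from 16 to 32 vCPUs, which the ratio `3955.89 / 3853.90` makes visible directly.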

TNN

TNN is an open-source deep learning reasoning framework developed by Tencent. Learn more via the OpenBenchmarking.org test page.

TNN 0.3, Target: CPU - Model: DenseNet (ms, Fewer Is Better)
8 vCPUs: 3842.12 (SE +/- 9.75, N = 3; MIN: 3619.38 / MAX: 4060.16)
16 vCPUs: 3358.73 (SE +/- 12.43, N = 3; MIN: 3163.2 / MAX: 3575.85)
32 vCPUs: 3056.90 (SE +/- 6.90, N = 3; MIN: 2928.19 / MAX: 3237.58)

TNN 0.3, Target: CPU - Model: MobileNet v2 (ms, Fewer Is Better)
8 vCPUs: 331.34 (SE +/- 0.78, N = 3; MIN: 327.36 / MAX: 339.94)
16 vCPUs: 328.89 (SE +/- 1.36, N = 3; MIN: 322.15 / MAX: 373.8)
32 vCPUs: 322.77 (SE +/- 0.05, N = 3; MIN: 319.63 / MAX: 326.43)

All TNN tests: (CXX) g++ options: -O3 -march=native -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

GROMACS

This test runs the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

GROMACS 2022.1, Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better)
8 vCPUs: 0.450 (SE +/- 0.000, N = 3)
16 vCPUs: 0.880 (SE +/- 0.001, N = 3)
32 vCPUs: 1.718 (SE +/- 0.010, N = 3)
(CXX) g++ options: -O3 -march=native

LAMMPS Molecular Dynamics Simulator

LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. Learn more via the OpenBenchmarking.org test page.

LAMMPS Molecular Dynamics Simulator 23Jun2022, Model: 20k Atoms (ns/day, More Is Better)
8 vCPUs: 4.662 (SE +/- 0.025, N = 3)
16 vCPUs: 8.499 (SE +/- 0.112, N = 3)
32 vCPUs: 16.550 (SE +/- 0.004, N = 3)

LAMMPS Molecular Dynamics Simulator 23Jun2022, Model: Rhodopsin Protein (ns/day, More Is Better)
8 vCPUs: 4.812 (SE +/- 0.011, N = 3)
16 vCPUs: 8.861 (SE +/- 0.037, N = 3)
32 vCPUs: 16.596 (SE +/- 0.012, N = 3)

All LAMMPS tests: (CXX) g++ options: -O3 -march=native -ldl

High Performance Conjugate Gradient

HPCG is the High Performance Conjugate Gradient, a newer scientific benchmark from Sandia National Labs focused on supercomputer testing with modern real-world workloads compared to HPCC. Learn more via the OpenBenchmarking.org test page.

High Performance Conjugate Gradient 3.1 (GFLOP/s, More Is Better)
8 vCPUs: 11.10 (SE +/- 0.00, N = 3)
16 vCPUs: 17.10 (SE +/- 0.03, N = 3)
32 vCPUs: 22.09 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

NAS Parallel Benchmarks

NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB. This test profile offers selecting the different NPB tests/problems and varying problem sizes. Learn more via the OpenBenchmarking.org test page.

NAS Parallel Benchmarks 3.4, Test / Class: BT.C (Total Mop/s, More Is Better)
8 vCPUs: 14368.29 (SE +/- 23.11, N = 3)
16 vCPUs: 49125.93 (SE +/- 18.18, N = 3)
32 vCPUs: 69530.64 (SE +/- 272.46, N = 3)

NAS Parallel Benchmarks 3.4, Test / Class: CG.C (Total Mop/s, More Is Better)
8 vCPUs: 6855.81 (SE +/- 49.63, N = 15)
16 vCPUs: 12171.95 (SE +/- 171.15, N = 3)
32 vCPUs: 21433.92 (SE +/- 35.67, N = 3)

NAS Parallel Benchmarks 3.4, Test / Class: EP.D (Total Mop/s, More Is Better)
8 vCPUs: 820.94 (SE +/- 0.56, N = 3)
16 vCPUs: 1634.99 (SE +/- 1.03, N = 3)
32 vCPUs: 3265.68 (SE +/- 2.04, N = 3)

NAS Parallel Benchmarks 3.4, Test / Class: FT.C (Total Mop/s, More Is Better)
8 vCPUs: 18574.23 (SE +/- 15.96, N = 3)
16 vCPUs: 32644.85 (SE +/- 300.01, N = 3)
32 vCPUs: 52309.81 (SE +/- 41.18, N = 3)

NAS Parallel Benchmarks 3.4, Test / Class: IS.D (Total Mop/s, More Is Better)
8 vCPUs: 1104.26 (SE +/- 1.14, N = 3)
16 vCPUs: 1498.45 (SE +/- 14.70, N = 3)
32 vCPUs: 1822.77 (SE +/- 0.86, N = 3)

NAS Parallel Benchmarks 3.4, Test / Class: LU.C (Total Mop/s, More Is Better)
8 vCPUs: 32029.14 (SE +/- 50.76, N = 3)
16 vCPUs: 55447.31 (SE +/- 701.24, N = 3)
32 vCPUs: 87702.30 (SE +/- 137.48, N = 3)

NAS Parallel Benchmarks 3.4, Test / Class: MG.C (Total Mop/s, More Is Better)
8 vCPUs: 27703.33 (SE +/- 46.23, N = 3)
16 vCPUs: 33309.76 (SE +/- 102.49, N = 3)
32 vCPUs: 50939.05 (SE +/- 31.40, N = 3)

NAS Parallel Benchmarks 3.4, Test / Class: SP.B (Total Mop/s, More Is Better)
8 vCPUs: 7338.98 (SE +/- 17.11, N = 3)
16 vCPUs: 19552.45 (SE +/- 244.58, N = 3)
32 vCPUs: 34381.91 (SE +/- 38.20, N = 3)

NAS Parallel Benchmarks 3.4, Test / Class: SP.C (Total Mop/s, More Is Better)
8 vCPUs: 7115.28 (SE +/- 39.91, N = 3)
16 vCPUs: 19710.90 (SE +/- 112.56, N = 3)
32 vCPUs: 26843.58 (SE +/- 31.60, N = 3)

All NAS Parallel Benchmarks tests: (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
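
The nine NPB kernels scale very differently, so a single summary figure is best formed as a geometric mean of the per-kernel ratios rather than an arithmetic mean of the raw Mop/s. The following is a minimal Python sketch using the values reported above; the helper names and the tuple layout (8, 16, 32 vCPUs) are illustrative and not taken from the Phoronix Test Suite.

```python
import math

# Total Mop/s from the NAS Parallel Benchmarks results above: (8, 16, 32 vCPUs).
NPB = {
    "BT.C": (14368.29, 49125.93, 69530.64),
    "CG.C": (6855.81, 12171.95, 21433.92),
    "EP.D": (820.94, 1634.99, 3265.68),
    "FT.C": (18574.23, 32644.85, 52309.81),
    "IS.D": (1104.26, 1498.45, 1822.77),
    "LU.C": (32029.14, 55447.31, 87702.30),
    "MG.C": (27703.33, 33309.76, 50939.05),
    "SP.B": (7338.98, 19552.45, 34381.91),
    "SP.C": (7115.28, 19710.90, 26843.58),
}


def geometric_mean(values):
    """Geometric mean via the mean of logarithms (all values must be positive)."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def speedups(results, lo_index, hi_index):
    """Per-test ratio of the result at hi_index over the result at lo_index."""
    return {name: r[hi_index] / r[lo_index] for name, r in results.items()}


if __name__ == "__main__":
    for label, lo, hi in (("8 -> 16", 0, 1), ("16 -> 32", 1, 2), ("8 -> 32", 0, 2)):
        ratios = speedups(NPB, lo, hi)
        print(f"{label} vCPUs: geometric mean speedup {geometric_mean(ratios.values()):.2f}x")
```

Because the geometric mean averages ratios multiplicatively, a single outsized result such as BT.C from 8 to 16 vCPUs (about 3.4x for twice the vCPUs) does not dominate the summary.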

ASKAP

ASKAP is a set of benchmarks from the Australian SKA Pathfinder. The principal ASKAP benchmarks are the Hogbom Clean Benchmark (tHogbomClean) and the Convolutional Resampling Benchmark (tConvolve); some earlier ASKAP benchmarks are also included for OpenCL and CUDA execution of tConvolve. Learn more via the OpenBenchmarking.org test page.

ASKAP 1.0, Test: tConvolve MT - Gridding (Million Grid Points Per Second, More Is Better)
8 vCPUs: 2360.63 (SE +/- 5.73, N = 3)
16 vCPUs: 3789.01 (SE +/- 5.95, N = 3)
32 vCPUs: 4456.55 (SE +/- 35.89, N = 15)

ASKAP 1.0, Test: tConvolve MT - Degridding (Million Grid Points Per Second, More Is Better)
8 vCPUs: 2196.74 (SE +/- 7.87, N = 3)
16 vCPUs: 4083.16 (SE +/- 2.61, N = 3)
32 vCPUs: 5522.07 (SE +/- 80.56, N = 15)

ASKAP 1.0, Test: tConvolve MPI - Degridding (Mpix/sec, More Is Better)
8 vCPUs: 1325.06 (SE +/- 24.91, N = 15)
16 vCPUs: 2585.31 (SE +/- 12.80, N = 3)
32 vCPUs: 3962.08 (SE +/- 54.84, N = 15)

ASKAP 1.0, Test: tConvolve MPI - Gridding (Mpix/sec, More Is Better)
8 vCPUs: 1977.89 (SE +/- 23.42, N = 15)
16 vCPUs: 3343.25 (SE +/- 32.26, N = 3)
32 vCPUs: 3899.28 (SE +/- 42.99, N = 15)

ASKAP 1.0, Test: tConvolve OpenMP - Gridding (Million Grid Points Per Second, More Is Better)
8 vCPUs: 2296.10 (SE +/- 29.91, N = 3)
16 vCPUs: 3631.81 (SE +/- 43.40, N = 3)
32 vCPUs: 7262.74 (SE +/- 66.63, N = 3)

ASKAP 1.0, Test: tConvolve OpenMP - Degridding (Million Grid Points Per Second, More Is Better)
8 vCPUs: 2421.43 (SE +/- 33.24, N = 3)
16 vCPUs: 5023.70 (SE +/- 0.00, N = 3)
32 vCPUs: 9181.24 (SE +/- 0.00, N = 3)

ASKAP 1.0, Test: Hogbom Clean OpenMP (Iterations Per Second, More Is Better)
8 vCPUs: 371.30 (SE +/- 1.21, N = 3)
16 vCPUs: 645.16 (SE +/- 0.00, N = 3)
32 vCPUs: 996.70 (SE +/- 3.30, N = 3)

All ASKAP tests: (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

OpenFOAM

OpenFOAM is the leading free, open-source software for computational fluid dynamics (CFD). This test profile currently uses the drivaerFastback test case for analyzing automotive aerodynamics or alternatively the older motorBike input. Learn more via the OpenBenchmarking.org test page.

OpenFOAM 9, Input: drivaerFastback, Medium Mesh Size - Mesh Time (Seconds, Fewer Is Better)
8 vCPUs: 425.95
16 vCPUs: 303.71
32 vCPUs: 206.40

OpenFOAM 9, Input: drivaerFastback, Medium Mesh Size - Execution Time (Seconds, Fewer Is Better)
8 vCPUs: 2426.16
16 vCPUs: 1534.72
32 vCPUs: 994.53

Per-run link options as listed on both charts: -lfoamToVTK -ldynamicMesh -llagrangian -lfileFormats; -lfoamToVTK -ldynamicMesh -llagrangian -lfileFormats; -ltransportModels -lspecie -lfiniteVolume -lfvModels -lmeshTools -lsampling
All OpenFOAM tests: (CXX) g++ options: -std=c++14 -O3 -mcpu=native -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lgenericPatchFields -lOpenFOAM -ldl -lm

GPAW

GPAW is a density-functional theory (DFT) Python code based on the projector-augmented wave (PAW) method and the atomic simulation environment (ASE). Learn more via the OpenBenchmarking.org test page.

GPAW 22.1, Input: Carbon Nanotube (Seconds, Fewer Is Better)
8 vCPUs: 381.20 (SE +/- 0.63, N = 3)
16 vCPUs: 208.97 (SE +/- 0.03, N = 3)
32 vCPUs: 130.35 (SE +/- 0.30, N = 3)
(CC) gcc options: -shared -fwrapv -O2 -O3 -march=native -lxc -lblas -lmpi

Coremark

This is a test of the EEMBC CoreMark processor benchmark. Learn more via the OpenBenchmarking.org test page.

Coremark 1.0, CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better)
8 vCPUs: 175037.77 (SE +/- 85.46, N = 3)
16 vCPUs: 351562.54 (SE +/- 87.09, N = 3)
32 vCPUs: 700917.94 (SE +/- 385.56, N = 3)
(CC) gcc options: -O2 -O3 -march=native -lrt" -lrt

Aircrack-ng

Aircrack-ng is a tool for assessing WiFi/WLAN network security. Learn more via the OpenBenchmarking.org test page.

Aircrack-ng 1.7 (k/s, More Is Better)
8 vCPUs: 8308.58 (SE +/- 103.85, N = 15)
16 vCPUs: 16697.92 (SE +/- 192.97, N = 15)
32 vCPUs: 33647.55 (SE +/- 287.54, N = 15)
Additional option listed for two of the three runs: -lpcre
(CXX) g++ options: -std=gnu++17 -O3 -fvisibility=hidden -fcommon -rdynamic -lnl-3 -lnl-genl-3 -lpthread -lz -lssl -lcrypto -lhwloc -ldl -lm -pthread

Timed FFmpeg Compilation

This test times how long it takes to build the FFmpeg multimedia library. Learn more via the OpenBenchmarking.org test page.

Timed FFmpeg Compilation 4.4, Time To Compile (Seconds, Fewer Is Better)
8 vCPUs: 113.46 (SE +/- 0.18, N = 3)
16 vCPUs: 61.98 (SE +/- 0.09, N = 3)
32 vCPUs: 38.96 (SE +/- 0.16, N = 3)

Timed MPlayer Compilation

This test times how long it takes to build the MPlayer open-source media player program. Learn more via the OpenBenchmarking.org test page.

Timed MPlayer Compilation 1.5, Time To Compile (Seconds, Fewer Is Better)
8 vCPUs: 88.84 (SE +/- 0.03, N = 3)
16 vCPUs: 47.62 (SE +/- 0.26, N = 3)
32 vCPUs: 28.93 (SE +/- 0.35, N = 4)
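
The FFmpeg and MPlayer build times above fall short of halving with each doubling of vCPUs, which is the pattern Amdahl's law describes. As a rough sketch, and assuming the simple model T(n) = T1 * (s + (1 - s) / n) with the vCPU count standing in for the number of parallel workers (an approximation that ignores I/O, memory bandwidth, and scheduler effects), the serial fraction s can be estimated from two measurements. The function names are hypothetical.

```python
def amdahl_serial_fraction(n1, t1, n2, t2):
    """Estimate the serial fraction s from two (workers, time) measurements,
    assuming Amdahl's law: T(n) = T1 * (s + (1 - s) / n)."""
    r = t1 / t2
    numerator = 1.0 / n1 - r / n2
    denominator = r * (1.0 - 1.0 / n2) - (1.0 - 1.0 / n1)
    return numerator / denominator


def amdahl_time(t_ref, n_ref, n, s):
    """Predict the time at n workers from a reference measurement at n_ref workers."""
    return t_ref * (s + (1.0 - s) / n) / (s + (1.0 - s) / n_ref)


# Time To Compile in seconds, from the FFmpeg and MPlayer results above.
BUILD_TIMES = {
    "FFmpeg": {8: 113.46, 16: 61.98, 32: 38.96},
    "MPlayer": {8: 88.84, 16: 47.62, 32: 28.93},
}

if __name__ == "__main__":
    for name, t in BUILD_TIMES.items():
        # Fit on the 8 and 32 vCPU points, then check the model against 16 vCPUs.
        s = amdahl_serial_fraction(8, t[8], 32, t[32])
        predicted_16 = amdahl_time(t[8], 8, 16, s)
        print(f"{name}: estimated serial fraction {s:.1%}, "
              f"predicted 16 vCPU time {predicted_16:.1f} s (measured {t[16]} s)")
```

Fitting on the 8 and 32 vCPU points and then predicting the 16 vCPU time gives a quick check of how well this simple model describes the measurements.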

Sysbench

This is a benchmark of Sysbench with the built-in CPU and memory sub-tests. Sysbench is a scriptable multi-threaded benchmark tool based on LuaJIT. Learn more via the OpenBenchmarking.org test page.

Sysbench 1.0.20, Test: CPU (Events Per Second, More Is Better)
8 vCPUs: 27237.28 (SE +/- 6.95, N = 3)
16 vCPUs: 54317.42 (SE +/- 12.70, N = 3)
32 vCPUs: 108241.61 (SE +/- 23.77, N = 3)
(CC) gcc options: -O2 -funroll-loops -O3 -march=native -rdynamic -ldl -laio -lm

VP9 libvpx Encoding

This is a standard video encoding performance test of Google's libvpx library and the vpxenc command for the VP9 video format. Learn more via the OpenBenchmarking.org test page.

VP9 libvpx Encoding 1.10.0, Speed: Speed 5 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
8 vCPUs: 6.11 (SE +/- 0.01, N = 3)
16 vCPUs: 6.68 (SE +/- 0.01, N = 3)
32 vCPUs: 6.99 (SE +/- 0.02, N = 3)

VP9 libvpx Encoding 1.10.0, Speed: Speed 0 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
8 vCPUs: 4.65 (SE +/- 0.01, N = 3)
16 vCPUs: 4.84 (SE +/- 0.01, N = 3)
32 vCPUs: 4.99 (SE +/- 0.01, N = 3)

VP9 libvpx Encoding 1.10.0, Speed: Speed 5 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
8 vCPUs: 11.27 (SE +/- 0.01, N = 3)
16 vCPUs: 11.78 (SE +/- 0.02, N = 3)
32 vCPUs: 12.11 (SE +/- 0.01, N = 3)

All VP9 libvpx Encoding tests: (CXX) g++ options: -lm -lpthread -O3 -march=native -march=armv8-a -fPIC -U_FORTIFY_SOURCE -std=gnu++11

libavif avifenc

This is a test of the AOMedia libavif library testing the encoding of a JPEG image to AV1 Image Format (AVIF). Learn more via the OpenBenchmarking.org test page.

libavif avifenc 0.10, Encoder Speed: 0 (Seconds, Fewer Is Better)
8 vCPUs: 456.24 (SE +/- 0.90, N = 3)
16 vCPUs: 328.97 (SE +/- 0.80, N = 3)
32 vCPUs: 266.34 (SE +/- 0.65, N = 3)

libavif avifenc 0.10, Encoder Speed: 2 (Seconds, Fewer Is Better)
8 vCPUs: 245.60 (SE +/- 0.32, N = 3)
16 vCPUs: 194.77 (SE +/- 0.47, N = 3)
32 vCPUs: 169.64 (SE +/- 0.13, N = 3)

libavif avifenc 0.10, Encoder Speed: 6 (Seconds, Fewer Is Better)
8 vCPUs: 20.273 (SE +/- 0.130, N = 3)
16 vCPUs: 11.168 (SE +/- 0.037, N = 3)
32 vCPUs: 6.682 (SE +/- 0.020, N = 3)

libavif avifenc 0.10, Encoder Speed: 6, Lossless (Seconds, Fewer Is Better)
8 vCPUs: 23.31 (SE +/- 0.11, N = 3)
16 vCPUs: 14.70 (SE +/- 0.19, N = 3)
32 vCPUs: 10.34 (SE +/- 0.00, N = 3)

libavif avifenc 0.10, Encoder Speed: 10, Lossless (Seconds, Fewer Is Better)
8 vCPUs: 9.997 (SE +/- 0.096, N = 3)
16 vCPUs: 7.658 (SE +/- 0.065, N = 8)
32 vCPUs: 6.775 (SE +/- 0.072, N = 3)

All libavif avifenc tests: (CXX) g++ options: -O3 -fPIC -march=native -lm

Timed Gem5 Compilation

This test times how long it takes to compile Gem5. Gem5 is a simulator for computer system architecture research. Gem5 is widely used for computer architecture research within the industry, academia, and more. Learn more via the OpenBenchmarking.org test page.

Timed Gem5 Compilation 21.2, Time To Compile (Seconds, Fewer Is Better)
8 vCPUs: 917.39 (SE +/- 0.66, N = 3)
16 vCPUs: 495.54 (SE +/- 1.16, N = 3)
32 vCPUs: 312.12 (SE +/- 1.96, N = 3)

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.

OpenSSL 3.0 - Algorithm: SHA256 (byte/s, More Is Better)
  16 vCPUs: 12926411527 (SE +/- 19283388.31, N = 3)
  32 vCPUs: 25788919913 (SE +/- 119493320.18, N = 3)
  8 vCPUs: 6456083507 (SE +/- 19026629.44, N = 3)
  1. (CC) gcc options: -pthread -O3 -march=native -lssl -lcrypto -ldl

OpenSSL 3.0 - Algorithm: RSA4096 (sign/s, More Is Better)
  16 vCPUs: 786.7 (SE +/- 0.06, N = 3)
  32 vCPUs: 1570.2 (SE +/- 0.06, N = 3)
  8 vCPUs: 393.7 (SE +/- 0.07, N = 3)
  1. (CC) gcc options: -pthread -O3 -march=native -lssl -lcrypto -ldl

OpenSSL 3.0 - Algorithm: RSA4096 (verify/s, More Is Better)
  16 vCPUs: 64247.0 (SE +/- 8.35, N = 3)
  32 vCPUs: 128273.1 (SE +/- 29.86, N = 3)
  8 vCPUs: 32136.8 (SE +/- 10.26, N = 3)
  1. (CC) gcc options: -pthread -O3 -march=native -lssl -lcrypto -ldl
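
To put the OpenSSL numbers in terms of vCPU scaling, the sketch below computes speedup and scaling efficiency relative to the 8 vCPU instance. The `scaling` helper and its "efficiency" definition (speedup divided by the increase in vCPU count) are illustrative assumptions, not something the Phoronix Test Suite reports; the input figures are copied from the OpenSSL 3.0 results above.

```python
def scaling(results, baseline=8):
    """Map vCPU count -> (speedup, efficiency) relative to the baseline size.

    `results` maps vCPU count -> throughput, where higher is better.
    Efficiency is speedup divided by the increase in vCPU count, so 1.0
    means perfectly linear scaling.
    """
    base = results[baseline]
    table = {}
    for vcpus in sorted(results):
        speedup = results[vcpus] / base
        table[vcpus] = (speedup, speedup / (vcpus / baseline))
    return table


# Figures copied from the OpenSSL 3.0 results above.
rsa4096_sign = {8: 393.7, 16: 786.7, 32: 1570.2}            # sign/s
sha256 = {8: 6456083507, 16: 12926411527, 32: 25788919913}  # byte/s

for name, results in (("RSA4096 sign", rsa4096_sign), ("SHA256", sha256)):
    for vcpus, (speedup, efficiency) in scaling(results).items():
        print(f"{name:>12} {vcpus:>2} vCPUs: {speedup:.2f}x, {efficiency:.1%} efficiency")
```

With these inputs, both algorithms land within about one percent of linear scaling from 8 to 32 vCPUs.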

Apache Spark

This is a benchmark of Apache Spark with its PySpark interface. Apache Spark is an open-source unified analytics engine for large-scale data processing and dealing with big data. This test profile benchmarks Apache Spark in a single-system configuration using spark-submit. The test makes use of DIYBigData's pyspark-benchmark (https://github.com/DIYBigData/pyspark-benchmark/) for generating test data and running various Apache Spark operations. Learn more via the OpenBenchmarking.org test page.

Apache Spark 3.3 - Row Count: 1000000 - Partitions: 100 - SHA-512 Benchmark Time (Seconds, Fewer Is Better)
  16 vCPUs: 4.93 (SE +/- 0.05, N = 12)
  32 vCPUs: 4.79 (SE +/- 0.11, N = 15)
  8 vCPUs: 6.28 (SE +/- 0.03, N = 3)

Apache Spark 3.3 - Row Count: 1000000 - Partitions: 100 - Calculate Pi Benchmark (Seconds, Fewer Is Better)
  16 vCPUs: 137.76 (SE +/- 0.11, N = 12)
  32 vCPUs: 69.77 (SE +/- 0.06, N = 15)
  8 vCPUs: 277.89 (SE +/- 0.17, N = 3)

Apache Spark 3.3 - Row Count: 1000000 - Partitions: 100 - Calculate Pi Benchmark Using Dataframe (Seconds, Fewer Is Better)
  16 vCPUs: 8.39 (SE +/- 0.02, N = 12)
  32 vCPUs: 4.79 (SE +/- 0.01, N = 15)
  8 vCPUs: 15.92 (SE +/- 0.02, N = 3)

Apache Spark 3.3 - Row Count: 1000000 - Partitions: 100 - Repartition Test Time (Seconds, Fewer Is Better)
  16 vCPUs: 2.55 (SE +/- 0.03, N = 12)
  32 vCPUs: 2.01 (SE +/- 0.03, N = 15)
  8 vCPUs: 4.58 (SE +/- 0.02, N = 3)

Apache Spark 3.3 - Row Count: 1000000 - Partitions: 100 - Inner Join Test Time (Seconds, Fewer Is Better)
  16 vCPUs: 2.22 (SE +/- 0.03, N = 12)
  32 vCPUs: 2.13 (SE +/- 0.02, N = 15)
  8 vCPUs: 3.48 (SE +/- 0.04, N = 3)

Apache Spark 3.3 - Row Count: 1000000 - Partitions: 100 - Broadcast Inner Join Test Time (Seconds, Fewer Is Better)
  16 vCPUs: 1.79 (SE +/- 0.04, N = 12)
  32 vCPUs: 1.68 (SE +/- 0.03, N = 15)
  8 vCPUs: 2.86 (SE +/- 0.03, N = 3)

Apache Spark 3.3 - Row Count: 1000000 - Partitions: 2000 - SHA-512 Benchmark Time (Seconds, Fewer Is Better)
  16 vCPUs: 5.91 (SE +/- 0.05, N = 3)
  32 vCPUs: 4.96 (SE +/- 0.04, N = 15)
  8 vCPUs: 8.09 (SE +/- 0.08, N = 3)

Apache Spark 3.3 - Row Count: 1000000 - Partitions: 2000 - Calculate Pi Benchmark (Seconds, Fewer Is Better)
  16 vCPUs: 137.17 (SE +/- 0.20, N = 3)
  32 vCPUs: 69.92 (SE +/- 0.06, N = 15)
  8 vCPUs: 278.47 (SE +/- 0.35, N = 3)

Apache Spark 3.3 - Row Count: 1000000 - Partitions: 2000 - Calculate Pi Benchmark Using Dataframe (Seconds, Fewer Is Better)
  16 vCPUs: 8.29 (SE +/- 0.03, N = 3)
  32 vCPUs: 4.80 (SE +/- 0.01, N = 15)
  8 vCPUs: 15.81 (SE +/- 0.07, N = 3)

Apache Spark 3.3 - Row Count: 1000000 - Partitions: 2000 - Group By Test Time (Seconds, Fewer Is Better)
  16 vCPUs: 7.43 (SE +/- 0.10, N = 3)
  32 vCPUs: 6.72 (SE +/- 0.05, N = 15)
  8 vCPUs: 8.85 (SE +/- 0.13, N = 3)

Apache Spark 3.3 - Row Count: 1000000 - Partitions: 2000 - Repartition Test Time (Seconds, Fewer Is Better)
  16 vCPUs: 3.36 (SE +/- 0.01, N = 3)
  32 vCPUs: 2.60 (SE +/- 0.03, N = 15)
  8 vCPUs: 5.67 (SE +/- 0.01, N = 3)

Apache Spark 3.3 - Row Count: 1000000 - Partitions: 2000 - Inner Join Test Time (Seconds, Fewer Is Better)
  16 vCPUs: 3.65 (SE +/- 0.11, N = 3)
  32 vCPUs: 2.87 (SE +/- 0.04, N = 15)
  8 vCPUs: 5.98 (SE +/- 0.09, N = 3)

Apache Spark 3.3 - Row Count: 1000000 - Partitions: 2000 - Broadcast Inner Join Test Time (Seconds, Fewer Is Better)
  16 vCPUs: 2.65 (SE +/- 0.05, N = 3)
  32 vCPUs: 2.12 (SE +/- 0.02, N = 15)
  8 vCPUs: 5.18 (SE +/- 0.07, N = 3)

Apache Spark 3.3 - Row Count: 40000000 - Partitions: 100 - SHA-512 Benchmark Time (Seconds, Fewer Is Better)
  16 vCPUs: 51.40 (SE +/- 0.22, N = 3)
  32 vCPUs: 46.30 (SE +/- 0.45, N = 9)
  8 vCPUs: 93.67 (SE +/- 0.85, N = 3)

Apache Spark 3.3 - Row Count: 40000000 - Partitions: 100 - Calculate Pi Benchmark (Seconds, Fewer Is Better)
  16 vCPUs: 136.85 (SE +/- 0.06, N = 3)
  32 vCPUs: 69.57 (SE +/- 0.08, N = 9)
  8 vCPUs: 278.37 (SE +/- 0.14, N = 3)

Apache Spark 3.3 - Row Count: 40000000 - Partitions: 100 - Calculate Pi Benchmark Using Dataframe (Seconds, Fewer Is Better)
  16 vCPUs: 8.40 (SE +/- 0.01, N = 3)
  32 vCPUs: 4.76 (SE +/- 0.01, N = 9)
  8 vCPUs: 15.65 (SE +/- 0.05, N = 3)

Apache Spark 3.3 - Row Count: 40000000 - Partitions: 100 - Group By Test Time (Seconds, Fewer Is Better)
  16 vCPUs: 35.64 (SE +/- 0.24, N = 3)
  32 vCPUs: 27.64 (SE +/- 0.16, N = 9)
  8 vCPUs: 50.87 (SE +/- 0.53, N = 3)

Apache Spark 3.3 - Row Count: 40000000 - Partitions: 100 - Repartition Test Time (Seconds, Fewer Is Better)
  16 vCPUs: 37.22 (SE +/- 0.92, N = 3)
  32 vCPUs: 24.36 (SE +/- 0.12, N = 9)
  8 vCPUs: 68.68 (SE +/- 0.25, N = 3)

Apache Spark 3.3 - Row Count: 40000000 - Partitions: 100 - Inner Join Test Time (Seconds, Fewer Is Better)
  16 vCPUs: 44.62 (SE +/- 0.72, N = 3)
  32 vCPUs: 30.32 (SE +/- 0.44, N = 9)
  8 vCPUs: 80.02 (SE +/- 0.19, N = 3)

Apache Spark 3.3 - Row Count: 40000000 - Partitions: 100 - Broadcast Inner Join Test Time (Seconds, Fewer Is Better)
  16 vCPUs: 45.29 (SE +/- 0.44, N = 3)
  32 vCPUs: 31.98 (SE +/- 0.26, N = 9)
  8 vCPUs: 80.26 (SE +/- 0.22, N = 3)

Apache Spark 3.3 - Row Count: 40000000 - Partitions: 2000 - SHA-512 Benchmark Time (Seconds, Fewer Is Better)
  16 vCPUs: 51.55 (SE +/- 0.08, N = 3)
  32 vCPUs: 39.22 (SE +/- 0.55, N = 12)
  8 vCPUs: 89.23 (SE +/- 0.52, N = 3)

Apache Spark 3.3 - Row Count: 40000000 - Partitions: 2000 - Calculate Pi Benchmark (Seconds, Fewer Is Better)
  16 vCPUs: 137.04 (SE +/- 0.17, N = 3)
  32 vCPUs: 69.79 (SE +/- 0.11, N = 12)
  8 vCPUs: 277.83 (SE +/- 0.13, N = 3)

Apache Spark 3.3 - Row Count: 40000000 - Partitions: 2000 - Calculate Pi Benchmark Using Dataframe (Seconds, Fewer Is Better)
  16 vCPUs: 8.34 (SE +/- 0.03, N = 3)
  32 vCPUs: 4.78 (SE +/- 0.02, N = 12)
  8 vCPUs: 15.71 (SE +/- 0.00, N = 3)

Apache Spark 3.3 - Row Count: 40000000 - Partitions: 2000 - Group By Test Time (Seconds, Fewer Is Better)
  16 vCPUs: 30.70 (SE +/- 0.23, N = 3)
  32 vCPUs: 22.84 (SE +/- 0.32, N = 12)
  8 vCPUs: 45.61 (SE +/- 0.57, N = 3)

Apache Spark 3.3 - Row Count: 40000000 - Partitions: 2000 - Repartition Test Time (Seconds, Fewer Is Better)
  16 vCPUs: 35.45 (SE +/- 0.20, N = 3)
  32 vCPUs: 22.22 (SE +/- 0.24, N = 12)
  8 vCPUs: 66.27 (SE +/- 0.36, N = 3)

Apache Spark 3.3 - Row Count: 40000000 - Partitions: 2000 - Inner Join Test Time (Seconds, Fewer Is Better)
  16 vCPUs: 44.39 (SE +/- 0.73, N = 3)
  32 vCPUs: 28.66 (SE +/- 0.19, N = 12)
  8 vCPUs: 78.00 (SE +/- 1.11, N = 3)

Apache Spark 3.3 - Row Count: 40000000 - Partitions: 2000 - Broadcast Inner Join Test Time (Seconds, Fewer Is Better)
  16 vCPUs: 42.75 (SE +/- 0.40, N = 3)
  32 vCPUs: 26.55 (SE +/- 0.17, N = 12)
  8 vCPUs: 74.71 (SE +/- 0.33, N = 3)
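
The Spark operations do not all benefit equally from more vCPUs. As a rough sketch, the snippet below ranks the "Row Count: 40000000 - Partitions: 2000" operations by their 8 to 32 vCPU speedup, computed as the 8 vCPU time divided by the 32 vCPU time since fewer seconds is better. The timings are copied from the results above; the `times` table and `ranked` list are illustrative names only.

```python
# Seconds (fewer is better) for Row Count: 40000000 - Partitions: 2000,
# copied from the results above as (8 vCPUs, 32 vCPUs).
times = {
    "SHA-512 Benchmark Time": (89.23, 39.22),
    "Calculate Pi Benchmark": (277.83, 69.79),
    "Calculate Pi Benchmark Using Dataframe": (15.71, 4.78),
    "Group By Test Time": (45.61, 22.84),
    "Repartition Test Time": (66.27, 22.22),
    "Inner Join Test Time": (78.00, 28.66),
    "Broadcast Inner Join Test Time": (74.71, 26.55),
}

# For time-based results, speedup is the slower time over the faster time.
ranked = sorted(
    ((name, t8 / t32) for name, (t8, t32) in times.items()),
    key=lambda item: item[1],
    reverse=True,
)

for name, speedup in ranked:
    print(f"{speedup:.2f}x  {name}")
```

On these figures Calculate Pi comes out close to the ideal 4x, while Group By gains only about 2x from four times as many vCPUs.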

Facebook RocksDB

This is a benchmark of Facebook's RocksDB, an embeddable persistent key-value store for fast storage based on Google's LevelDB. Learn more via the OpenBenchmarking.org test page.

Facebook RocksDB 7.0.1 - Test: Random Read (Op/s, More Is Better)
  16 vCPUs: 62048967 (SE +/- 735054.27, N = 3)
  32 vCPUs: 124704201 (SE +/- 376574.31, N = 3)
  8 vCPUs: 31055689 (SE +/- 252880.06, N = 3)
  1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Facebook RocksDB 7.0.1 - Test: Read While Writing (Op/s, More Is Better)
  16 vCPUs: 1264826 (SE +/- 20446.66, N = 15)
  32 vCPUs: 2610992 (SE +/- 32390.32, N = 12)
  8 vCPUs: 594702 (SE +/- 9067.95, N = 15)
  1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Facebook RocksDB 7.0.1 - Test: Read Random Write Random (Op/s, More Is Better)
  16 vCPUs: 884700 (SE +/- 1701.74, N = 3)
  32 vCPUs: 1321827 (SE +/- 9643.50, N = 15)
  8 vCPUs: 548353 (SE +/- 4976.00, N = 15)
  1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Apache Cassandra

This is a benchmark of the Apache Cassandra NoSQL database management system making use of cassandra-stress. Learn more via the OpenBenchmarking.org test page.

Apache Cassandra 4.0 - Test: Writes (Op/s, More Is Better)
  16 vCPUs: 39296 (SE +/- 256.55, N = 3)
  32 vCPUs: 87819 (SE +/- 777.36, N = 3)
  8 vCPUs: 17862 (SE +/- 136.95, N = 10)

PostgreSQL pgbench

This is a benchmark of PostgreSQL that uses pgbench to drive the database workload. Learn more via the OpenBenchmarking.org test page.

PostgreSQL pgbench 14.0 - Scaling Factor: 100 - Clients: 100 - Mode: Read Only (TPS, More Is Better)
  16 vCPUs: 157894 (SE +/- 697.61, N = 3)
  32 vCPUs: 329539 (SE +/- 1811.74, N = 3)
  8 vCPUs: 54237 (SE +/- 663.61, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lm

PostgreSQL pgbench 14.0 - Scaling Factor: 100 - Clients: 100 - Mode: Read Only - Average Latency (ms, Fewer Is Better)
  16 vCPUs: 0.633 (SE +/- 0.003, N = 3)
  32 vCPUs: 0.304 (SE +/- 0.002, N = 3)
  8 vCPUs: 1.844 (SE +/- 0.023, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lm
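
The TPS and average-latency results for a given pgbench configuration are two views of the same run. Assuming each client keeps exactly one transaction in flight, throughput should be roughly the client count divided by the average latency. The sketch below checks that relationship against the 100-client figures above; `tps_from_latency` is a hypothetical helper written for this illustration, not part of pgbench.

```python
def tps_from_latency(clients, latency_ms):
    """Estimate transactions per second from client count and average latency.

    Assumes every client runs one transaction at a time, back to back.
    """
    return clients / (latency_ms / 1000.0)


CLIENTS = 100

# vCPUs -> (reported TPS, reported average latency in ms), copied from above.
reported = {
    8: (54237, 1.844),
    16: (157894, 0.633),
    32: (329539, 0.304),
}

for vcpus, (tps, latency_ms) in sorted(reported.items()):
    estimate = tps_from_latency(CLIENTS, latency_ms)
    deviation = (estimate - tps) / tps
    print(f"{vcpus:>2} vCPUs: reported {tps} TPS, estimated {estimate:.0f} TPS ({deviation:+.2%})")
```

The estimates agree with the reported TPS to within a fraction of a percent, with the small differences attributable to rounding of the latency values.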

Blender

Blender is an open-source 3D creation software project. This test is of Blender's Cycles benchmark with various sample files. GPU computing is supported. This system/blender test profile makes use of the system-supplied Blender. Use pts/blender if wishing to stick to a fixed version of Blender. Learn more via the OpenBenchmarking.org test page.

Blender - Blend File: Fishy Cat - Compute: CPU-Only (Seconds, Fewer Is Better)
  16 vCPUs: 426.04 (SE +/- 0.85, N = 3)
  32 vCPUs: 214.41 (SE +/- 0.42, N = 3)
  8 vCPUs: 841.18 (SE +/- 1.74, N = 3)

Blender - Blend File: Classroom - Compute: CPU-Only (Seconds, Fewer Is Better)
  16 vCPUs: 506.10 (SE +/- 0.22, N = 3)
  32 vCPUs: 249.89 (SE +/- 0.07, N = 3)
  8 vCPUs: 1016.66 (SE +/- 1.99, N = 3)

Blender - Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better)
  16 vCPUs: 226.26 (SE +/- 0.50, N = 3)
  32 vCPUs: 112.47 (SE +/- 0.10, N = 3)
  8 vCPUs: 447.71 (SE +/- 0.04, N = 3)

PostgreSQL pgbench

PostgreSQL pgbench 14.0 - Scaling Factor: 100 - Clients: 250 - Mode: Read Only (TPS, More Is Better)
  16 vCPUs: 131607 (SE +/- 1418.81, N = 3)
  32 vCPUs: 312239 (SE +/- 4561.68, N = 12)
  8 vCPUs: 49628 (SE +/- 588.20, N = 12)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lm

PostgreSQL pgbench 14.0 - Scaling Factor: 100 - Clients: 250 - Mode: Read Only - Average Latency (ms, Fewer Is Better)
  16 vCPUs: 1.900 (SE +/- 0.021, N = 3)
  32 vCPUs: 0.803 (SE +/- 0.012, N = 12)
  8 vCPUs: 5.045 (SE +/- 0.060, N = 12)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lm

102 Results Shown

Stress-NG:
  Futex
  CPU Cache
  CPU Stress
  Matrix Math
  Vector Math
  System V Message Passing
SPECjbb 2015:
  SPECjbb2015-Composite max-jOPS
  SPECjbb2015-Composite critical-jOPS
DaCapo Benchmark
Renaissance:
  Apache Spark Bayes
  Savina Reactors.IO
ASTC Encoder:
  Medium
  Thorough
  Exhaustive
Graph500:
  26:
    bfs median_TEPS
    bfs max_TEPS
    sssp median_TEPS
    sssp max_TEPS
TensorFlow Lite:
  SqueezeNet
  Inception V4
  Mobilenet Float
  Inception ResNet V2
TNN:
  CPU - DenseNet
  CPU - MobileNet v2
GROMACS
LAMMPS Molecular Dynamics Simulator:
  20k Atoms
  Rhodopsin Protein
High Performance Conjugate Gradient
NAS Parallel Benchmarks:
  BT.C
  CG.C
  EP.D
  FT.C
  IS.D
  LU.C
  MG.C
  SP.B
  SP.C
ASKAP:
  tConvolve MT - Gridding
  tConvolve MT - Degridding
  tConvolve MPI - Degridding
  tConvolve MPI - Gridding
  tConvolve OpenMP - Gridding
  tConvolve OpenMP - Degridding
  Hogbom Clean OpenMP
OpenFOAM:
  drivaerFastback, Medium Mesh Size - Mesh Time
  drivaerFastback, Medium Mesh Size - Execution Time
GPAW
Coremark
Aircrack-ng
Timed FFmpeg Compilation
Timed MPlayer Compilation
Sysbench
VP9 libvpx Encoding:
  Speed 5 - Bosphorus 4K
  Speed 0 - Bosphorus 1080p
  Speed 5 - Bosphorus 1080p
libavif avifenc:
  0
  2
  6
  6, Lossless
  10, Lossless
Timed Gem5 Compilation
OpenSSL:
  SHA256
  RSA4096 (sign/s)
  RSA4096 (verify/s)
Apache Spark:
  1000000 - 100 - SHA-512 Benchmark Time
  1000000 - 100 - Calculate Pi Benchmark
  1000000 - 100 - Calculate Pi Benchmark Using Dataframe
  1000000 - 100 - Repartition Test Time
  1000000 - 100 - Inner Join Test Time
  1000000 - 100 - Broadcast Inner Join Test Time
  1000000 - 2000 - SHA-512 Benchmark Time
  1000000 - 2000 - Calculate Pi Benchmark
  1000000 - 2000 - Calculate Pi Benchmark Using Dataframe
  1000000 - 2000 - Group By Test Time
  1000000 - 2000 - Repartition Test Time
  1000000 - 2000 - Inner Join Test Time
  1000000 - 2000 - Broadcast Inner Join Test Time
  40000000 - 100 - SHA-512 Benchmark Time
  40000000 - 100 - Calculate Pi Benchmark
  40000000 - 100 - Calculate Pi Benchmark Using Dataframe
  40000000 - 100 - Group By Test Time
  40000000 - 100 - Repartition Test Time
  40000000 - 100 - Inner Join Test Time
  40000000 - 100 - Broadcast Inner Join Test Time
  40000000 - 2000 - SHA-512 Benchmark Time
  40000000 - 2000 - Calculate Pi Benchmark
  40000000 - 2000 - Calculate Pi Benchmark Using Dataframe
  40000000 - 2000 - Group By Test Time
  40000000 - 2000 - Repartition Test Time
  40000000 - 2000 - Inner Join Test Time
  40000000 - 2000 - Broadcast Inner Join Test Time
Facebook RocksDB:
  Rand Read
  Read While Writing
  Read Rand Write Rand
Apache Cassandra
PostgreSQL pgbench:
  100 - 100 - Read Only
  100 - 100 - Read Only - Average Latency
Blender:
  Fishy Cat - CPU-Only
  Classroom - CPU-Only
  BMW27 - CPU-Only
PostgreSQL pgbench:
  100 - 250 - Read Only
  100 - 250 - Read Only - Average Latency