Tau T2A 8 16 32 vCPU Scaling

Benchmarks by Michael Larabel for a future article

HTML result view exported from: https://openbenchmarking.org/result/2208120-PTS-2208123N58&sor&grs.

System configuration (Tau T2A)

8 vCPUs:
  Processor: ARMv8 Neoverse-N1 (8 Cores)
  Motherboard: KVM Google Compute Engine
  Memory: 32GB
  Disk: 215GB nvme_card-pd
  Network: Google Compute Engine Virtual
  OS: Ubuntu 22.04
  Kernel: 5.15.0-1013-gcp (aarch64)
  Compiler: GCC 12.0.1 20220319
  File-System: ext4
  System Layer: KVM

16 vCPUs:
  Processor: ARMv8 Neoverse-N1 (16 Cores)
  Memory: 64GB

32 vCPUs:
  Processor: ARMv8 Neoverse-N1 (32 Cores)
  Memory: 128GB
  Kernel: 5.15.0-1016-gcp (aarch64)

Kernel Details: Transparent Huge Pages: madvise

Environment Details: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"

Compiler Details: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v

Java Details: OpenJDK Runtime Environment (build 11.0.16+8-post-Ubuntu-0ubuntu122.04)

Python Details: Python 3.10.4

Security Details:
  Tau T2A: 8 vCPUs: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
  Tau T2A: 16 vCPUs: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
  Tau T2A: 32 vCPUs: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected

Result overview (Tau T2A)

Test | 8 vCPUs | 16 vCPUs | 32 vCPUs
pgbench: 100 - 250 - Read Only | 49628 | 131607 | 312239
pgbench: 100 - 250 - Read Only - Average Latency | 5.045 | 1.900 | 0.803
pgbench: 100 - 100 - Read Only | 54237 | 157894 | 329539
pgbench: 100 - 100 - Read Only - Average Latency | 1.844 | 0.633 | 0.304
spec-jbb2015: SPECjbb2015-Composite critical-jOPS | 3921 | 9207 | 22955
cassandra: Writes | 17862 | 39296 | 87819
npb: BT.C | 14368.29 | 49125.93 | 69530.64
npb: SP.B | 7338.98 | 19552.45 | 34381.91
blender: Classroom - CPU-Only | 1016.66 | 506.10 | 249.89
astcenc: Thorough | 29.0505 | 14.2146 | 7.1619
aircrack-ng | 8308.580 | 16697.924 | 33647.548
astcenc: Exhaustive | 276.8798 | 137.6218 | 68.6557
rocksdb: Rand Read | 31055689 | 62048967 | 124704201
coremark: CoreMark Size 666 - Iterations Per Second | 175037.765143 | 351562.543158 | 700917.944737
spark: 40000000 - 100 - Calculate Pi Benchmark | 278.366845677 | 136.849613019 | 69.57
openssl: SHA256 | 6456083507 | 12926411527 | 25788919913
openssl: RSA4096 (verify/s) | 32136.8 | 64247.0 | 128273.1
openssl: RSA4096 (sign/s) | 393.7 | 786.7 | 1570.2
spark: 1000000 - 100 - Calculate Pi Benchmark | 277.89 | 137.761285190 | 69.77
spark: 1000000 - 2000 - Calculate Pi Benchmark | 278.472527229 | 137.173976199 | 69.92
spark: 40000000 - 2000 - Calculate Pi Benchmark | 277.83 | 137.040207724 | 69.79
blender: BMW27 - CPU-Only | 447.71 | 226.26 | 112.47
npb: EP.D | 820.94 | 1634.99 | 3265.68
stress-ng: CPU Stress | 2065.53 | 4116.96 | 8209.47
sysbench: CPU | 27237.28 | 54317.42 | 108241.61
stress-ng: Matrix Math | 38215.95 | 76177.56 | 151792.83
stress-ng: Vector Math | 24633.99 | 49102.30 | 97749.08
blender: Fishy Cat - CPU-Only | 841.18 | 426.04 | 214.41
spec-jbb2015: SPECjbb2015-Composite max-jOPS | 9158 | 18092 | 35075
gromacs: MPI CPU - water_GMX50_bare | 0.45 | 0.880 | 1.718
askap: tConvolve OpenMP - Degridding | 2421.43 | 5023.7 | 9181.24
npb: SP.C | 7115.28 | 19710.90 | 26843.58
lammps: 20k Atoms | 4.662 | 8.499 | 16.550
lammps: Rhodopsin Protein | 4.812 | 8.861 | 16.596
spark: 1000000 - 100 - Calculate Pi Benchmark Using Dataframe | 15.92 | 8.39 | 4.79
spark: 1000000 - 2000 - Calculate Pi Benchmark Using Dataframe | 15.81 | 8.29 | 4.80
spark: 40000000 - 100 - Calculate Pi Benchmark Using Dataframe | 15.65 | 8.40 | 4.76
spark: 40000000 - 2000 - Calculate Pi Benchmark Using Dataframe | 15.71 | 8.34 | 4.78
askap: tConvolve OpenMP - Gridding | 2296.10 | 3631.81 | 7262.74
npb: CG.C | 6855.81 | 12171.95 | 21433.92
tensorflow-lite: Inception V4 | 97646.1 | 46113.4 | 31657.3
build-mplayer: Time To Compile | 88.838 | 47.621 | 28.928
avifenc: 6 | 20.273 | 11.168 | 6.682
spark: 40000000 - 2000 - Repartition Test Time | 66.27 | 35.45 | 22.22
build-gem5: Time To Compile | 917.393 | 495.541 | 312.120
gpaw: Carbon Nanotube | 381.196 | 208.965 | 130.353
build-ffmpeg: Time To Compile | 113.455 | 61.983 | 38.958
spark: 40000000 - 100 - Repartition Test Time | 68.68 | 37.22 | 24.36
npb: FT.C | 18574.23 | 32644.85 | 52309.81
spark: 40000000 - 2000 - Broadcast Inner Join Test Time | 74.71 | 42.75 | 26.55
npb: LU.C | 32029.14 | 55447.31 | 87702.30
spark: 40000000 - 2000 - Inner Join Test Time | 78.00 | 44.39 | 28.66
tensorflow-lite: Inception ResNet V2 | 91592.6 | 45445.9 | 33994.9
askap: Hogbom Clean OpenMP | 371.295 | 645.161 | 996.700
spark: 40000000 - 100 - Inner Join Test Time | 80.02 | 44.62 | 30.32
askap: tConvolve MT - Degridding | 2196.74 | 4083.16 | 5522.07
spark: 40000000 - 100 - Broadcast Inner Join Test Time | 80.26 | 45.29 | 31.98
spark: 1000000 - 2000 - Broadcast Inner Join Test Time | 5.18 | 2.65 | 2.12
openfoam: drivaerFastback, Medium Mesh Size - Execution Time | 2426.16 | 1534.72 | 994.53
rocksdb: Read Rand Write Rand | 548353 | 884700 | 1321827
spark: 1000000 - 100 - Repartition Test Time | 4.58 | 2.55 | 2.01
spark: 40000000 - 2000 - SHA-512 Benchmark Time | 89.23 | 51.55 | 39.22
avifenc: 6, Lossless | 23.312 | 14.702 | 10.341
spark: 1000000 - 2000 - Repartition Test Time | 5.67 | 3.36 | 2.60
tensorflow-lite: Mobilenet Float | 4395.73 | 2481.96 | 2093.25
spark: 1000000 - 2000 - Inner Join Test Time | 5.98 | 3.65 | 2.87
openfoam: drivaerFastback, Medium Mesh Size - Mesh Time | 425.95 | 303.71 | 206.4
spark: 40000000 - 100 - SHA-512 Benchmark Time | 93.67 | 51.40 | 46.30
spark: 40000000 - 2000 - Group By Test Time | 45.61 | 30.70 | 22.84
hpcg | 11.0969 | 17.0984 | 22.0930
askap: tConvolve MPI - Gridding | 1977.89 | 3343.25 | 3899.28
graph500: 26 (bfs max_TEPS) | (no result) | 262563000 | 508372000
askap: tConvolve MT - Gridding | 2360.63 | 3789.01 | 4456.55
graph500: 26 (bfs median_TEPS) | (no result) | 257478000 | 477377000
spark: 40000000 - 100 - Group By Test Time | 50.87 | 35.64 | 27.64
npb: MG.C | 27703.33 | 33309.76 | 50939.05
graph500: 26 (sssp max_TEPS) | (no result) | 95265500 | 169542000
graph500: 26 (sssp median_TEPS) | (no result) | 70750200 | 124702000
tensorflow-lite: SqueezeNet | 6618.32 | 3955.89 | 3853.90
avifenc: 0 | 456.237 | 328.970 | 266.337
npb: IS.D | 1104.26 | 1498.45 | 1822.77
spark: 1000000 - 100 - Inner Join Test Time | 3.48 | 2.22 | 2.13
spark: 1000000 - 2000 - SHA-512 Benchmark Time | 8.09 | 5.91 | 4.96
astcenc: Medium | 9.0505 | 6.9449 | 5.9825
avifenc: 10, Lossless | 9.997 | 7.658 | 6.775
avifenc: 2 | 245.601 | 194.774 | 169.639
stress-ng: System V Message Passing | 4507844.17 | 5475267.36 | 6128517.10
spark: 1000000 - 2000 - Group By Test Time | 8.85 | 7.43 | 6.72
stress-ng: CPU Cache | 436.31 | 551.25 | 566.91
tnn: CPU - DenseNet | 3842.123 | 3358.728 | 3056.897
vpxenc: Speed 5 - Bosphorus 4K | 6.11 | 6.68 | 6.99
vpxenc: Speed 5 - Bosphorus 1080p | 11.27 | 11.78 | 12.11
vpxenc: Speed 0 - Bosphorus 1080p | 4.65 | 4.84 | 4.99
tnn: CPU - MobileNet v2 | 331.344 | 328.886 | 322.768
rocksdb: Read While Writing | 594702 | 1264826 | 2610992
stress-ng: Futex | 937451.99 | 1198681.87 | 1437660.62
askap: tConvolve MPI - Degridding | 1325.06 | 2585.31 | 3962.08
spark: 1000000 - 100 - Broadcast Inner Join Test Time | 2.86 | 1.79 | 1.68
spark: 1000000 - 100 - SHA-512 Benchmark Time | 6.28 | 4.93 | 4.79
renaissance: Savina Reactors.IO | 26456.4 | 15981.5 | 10705.9
renaissance: Apache Spark Bayes | 2249.8 | 1262.0 | 766.4
dacapobench: Tradesoap | 8040 | 5598 | 5015

PostgreSQL pgbench

Scaling Factor: 100 - Clients: 250 - Mode: Read Only

PostgreSQL pgbench 14.0 - TPS, More Is Better
32 vCPUs: 312239 (SE +/- 4561.68, N = 12)
16 vCPUs: 131607 (SE +/- 1418.81, N = 3)
8 vCPUs: 49628 (SE +/- 588.20, N = 12)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lm

PostgreSQL pgbench

Scaling Factor: 100 - Clients: 250 - Mode: Read Only - Average Latency

PostgreSQL pgbench 14.0 - ms, Fewer Is Better
32 vCPUs: 0.803 (SE +/- 0.012, N = 12)
16 vCPUs: 1.900 (SE +/- 0.021, N = 3)
8 vCPUs: 5.045 (SE +/- 0.060, N = 12)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lm

PostgreSQL pgbench

Scaling Factor: 100 - Clients: 100 - Mode: Read Only

PostgreSQL pgbench 14.0 - TPS, More Is Better
32 vCPUs: 329539 (SE +/- 1811.74, N = 3)
16 vCPUs: 157894 (SE +/- 697.61, N = 3)
8 vCPUs: 54237 (SE +/- 663.61, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lm

PostgreSQL pgbench

Scaling Factor: 100 - Clients: 100 - Mode: Read Only - Average Latency

PostgreSQL pgbench 14.0 - ms, Fewer Is Better
32 vCPUs: 0.304 (SE +/- 0.002, N = 3)
16 vCPUs: 0.633 (SE +/- 0.003, N = 3)
8 vCPUs: 1.844 (SE +/- 0.023, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lm
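
The pgbench throughput and average-latency results above look like two views of the same runs. As a rough cross-check, the sketch below assumes a closed-loop relation in which every client keeps exactly one transaction in flight, so average latency is approximately clients / TPS. That relation and the helper names (`estimated_latency_ms`, `latency_gaps`) are assumptions made for illustration and are not stated in the result export; only the numbers are taken from the reported results.

```python
# Reported pgbench read-only results: (clients, vCPUs) -> (TPS, average latency in ms)
REPORTED = {
    (250, 8): (49628, 5.045),
    (250, 16): (131607, 1.900),
    (250, 32): (312239, 0.803),
    (100, 8): (54237, 1.844),
    (100, 16): (157894, 0.633),
    (100, 32): (329539, 0.304),
}


def estimated_latency_ms(clients: int, tps: float) -> float:
    """Assumed relation: one in-flight transaction per client, so latency = clients / TPS."""
    return 1000.0 * clients / tps


def latency_gaps() -> dict:
    """Absolute difference (ms) between the estimate and the reported average latency."""
    return {
        key: abs(estimated_latency_ms(key[0], tps) - reported_ms)
        for key, (tps, reported_ms) in REPORTED.items()
    }


if __name__ == "__main__":
    for (clients, vcpus), (tps, reported_ms) in REPORTED.items():
        est = estimated_latency_ms(clients, tps)
        print(f"{clients} clients, {vcpus} vCPUs: estimated {est:.3f} ms, reported {reported_ms:.3f} ms")
```

Under this assumption the estimates land within a few hundredths of a millisecond of the reported latencies, which suggests the latency charts carry little information beyond the TPS charts for these runs.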

SPECjbb 2015

SPECjbb2015-Composite critical-jOPS

SPECjbb 2015 - jOPS, More Is Better
32 vCPUs: 22955
16 vCPUs: 9207
8 vCPUs: 3921

Apache Cassandra

Test: Writes

Apache Cassandra 4.0 - Op/s, More Is Better
32 vCPUs: 87819 (SE +/- 777.36, N = 3)
16 vCPUs: 39296 (SE +/- 256.55, N = 3)
8 vCPUs: 17862 (SE +/- 136.95, N = 10)

NAS Parallel Benchmarks

Test / Class: BT.C

NAS Parallel Benchmarks 3.4 - Total Mop/s, More Is Better
32 vCPUs: 69530.64 (SE +/- 272.46, N = 3)
16 vCPUs: 49125.93 (SE +/- 18.18, N = 3)
8 vCPUs: 14368.29 (SE +/- 23.11, N = 3)
(F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

NAS Parallel Benchmarks

Test / Class: SP.B

NAS Parallel Benchmarks 3.4 - Total Mop/s, More Is Better
32 vCPUs: 34381.91 (SE +/- 38.20, N = 3)
16 vCPUs: 19552.45 (SE +/- 244.58, N = 3)
8 vCPUs: 7338.98 (SE +/- 17.11, N = 3)
(F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Blender

Blend File: Classroom - Compute: CPU-Only

Blender - Seconds, Fewer Is Better
32 vCPUs: 249.89 (SE +/- 0.07, N = 3)
16 vCPUs: 506.10 (SE +/- 0.22, N = 3)
8 vCPUs: 1016.66 (SE +/- 1.99, N = 3)

ASTC Encoder

Preset: Thorough

ASTC Encoder 3.2 - Seconds, Fewer Is Better
32 vCPUs: 7.1619 (SE +/- 0.0033, N = 3)
16 vCPUs: 14.2146 (SE +/- 0.0106, N = 3)
8 vCPUs: 29.0505 (SE +/- 0.0316, N = 3)
(CXX) g++ options: -O3 -march=native -flto -pthread

Aircrack-ng

Aircrack-ng 1.7 - k/s, More Is Better
32 vCPUs: 33647.55 (SE +/- 287.54, N = 15)
16 vCPUs: 16697.92 (SE +/- 192.97, N = 15)
8 vCPUs: 8308.58 (SE +/- 103.85, N = 15)
Extra options listed on the chart: -lpcre; -lpcre
(CXX) g++ options: -std=gnu++17 -O3 -fvisibility=hidden -fcommon -rdynamic -lnl-3 -lnl-genl-3 -lpthread -lz -lssl -lcrypto -lhwloc -ldl -lm -pthread
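
Each result is reported with an "SE +/-" figure and a run count N. The sketch below is one hedged way to read those figures, assuming SE is the standard error of the mean (SD / sqrt(N)); that interpretation and the helper name `spread` are assumptions, not something the export states. It recovers the implied run-to-run standard deviation and the relative standard error for the Aircrack-ng results above.

```python
import math

# Aircrack-ng 1.7 results (k/s): vCPUs -> (mean, SE, N), as reported above
AIRCRACK = {
    32: (33647.55, 287.54, 15),
    16: (16697.92, 192.97, 15),
    8: (8308.58, 103.85, 15),
}


def spread(mean: float, se: float, n: int) -> tuple:
    """Return (implied standard deviation, relative standard error in percent).

    Assumes SE is the standard error of the mean, i.e. SE = SD / sqrt(N).
    """
    implied_sd = se * math.sqrt(n)
    relative_se_percent = 100.0 * se / mean
    return implied_sd, relative_se_percent


if __name__ == "__main__":
    for vcpus, (mean, se, n) in AIRCRACK.items():
        sd, rel = spread(mean, se, n)
        print(f"{vcpus} vCPUs: implied SD {sd:.1f} k/s, relative SE {rel:.2f}%")
```

A relative standard error near or below one percent, as here, means the differences between the three configurations are far larger than the run-to-run noise.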

ASTC Encoder

Preset: Exhaustive

ASTC Encoder 3.2 - Seconds, Fewer Is Better
32 vCPUs: 68.66 (SE +/- 0.08, N = 3)
16 vCPUs: 137.62 (SE +/- 0.04, N = 3)
8 vCPUs: 276.88 (SE +/- 3.08, N = 3)
(CXX) g++ options: -O3 -march=native -flto -pthread

Facebook RocksDB

Test: Random Read

Facebook RocksDB 7.0.1 - Op/s, More Is Better
32 vCPUs: 124704201 (SE +/- 376574.31, N = 3)
16 vCPUs: 62048967 (SE +/- 735054.27, N = 3)
8 vCPUs: 31055689 (SE +/- 252880.06, N = 3)
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Coremark

CoreMark Size 666 - Iterations Per Second

Coremark 1.0 - Iterations/Sec, More Is Better
32 vCPUs: 700917.94 (SE +/- 385.56, N = 3)
16 vCPUs: 351562.54 (SE +/- 87.09, N = 3)
8 vCPUs: 175037.77 (SE +/- 85.46, N = 3)
(CC) gcc options: -O2 -O3 -march=native -lrt" -lrt
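
Since these results compare 8, 16 and 32 vCPUs, one way to read any single test is as a speedup over the 8 vCPU run and a scaling efficiency (speedup divided by the increase in vCPU count). The sketch below applies that to the CoreMark figures above. The function name `scaling`, the choice of 8 vCPUs as the baseline, and the `higher_is_better` flag (for tests reported in seconds) are illustrative assumptions, not part of the export.

```python
BASELINE_VCPUS = 8  # assumed baseline: the smallest configuration tested

# CoreMark iterations/sec as reported above: vCPUs -> result
COREMARK = {8: 175037.77, 16: 351562.54, 32: 700917.94}


def scaling(results: dict, higher_is_better: bool = True) -> dict:
    """Return {vCPUs: (speedup vs baseline, scaling efficiency)}.

    For "fewer is better" metrics such as seconds, pass higher_is_better=False
    so that a shorter time counts as a larger speedup.
    """
    base = results[BASELINE_VCPUS]
    out = {}
    for vcpus, value in sorted(results.items()):
        speedup = value / base if higher_is_better else base / value
        efficiency = speedup / (vcpus / BASELINE_VCPUS)
        out[vcpus] = (speedup, efficiency)
    return out


if __name__ == "__main__":
    for vcpus, (speedup, efficiency) in scaling(COREMARK).items():
        print(f"{vcpus} vCPUs: speedup {speedup:.3f}x, efficiency {efficiency:.1%}")
```

An efficiency close to 100% indicates near-linear scaling; the same helper can be pointed at any other test in this result set by swapping in its three values.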

Apache Spark

Row Count: 40000000 - Partitions: 100 - Calculate Pi Benchmark

Apache Spark 3.3 - Seconds, Fewer Is Better
32 vCPUs: 69.57 (SE +/- 0.08, N = 9)
16 vCPUs: 136.85 (SE +/- 0.06, N = 3)
8 vCPUs: 278.37 (SE +/- 0.14, N = 3)

OpenSSL

Algorithm: SHA256

OpenSSL 3.0 - byte/s, More Is Better
32 vCPUs: 25788919913 (SE +/- 119493320.18, N = 3)
16 vCPUs: 12926411527 (SE +/- 19283388.31, N = 3)
8 vCPUs: 6456083507 (SE +/- 19026629.44, N = 3)
(CC) gcc options: -pthread -O3 -march=native -lssl -lcrypto -ldl

OpenSSL

Algorithm: RSA4096

OpenSSL 3.0 - verify/s, More Is Better
32 vCPUs: 128273.1 (SE +/- 29.86, N = 3)
16 vCPUs: 64247.0 (SE +/- 8.35, N = 3)
8 vCPUs: 32136.8 (SE +/- 10.26, N = 3)
(CC) gcc options: -pthread -O3 -march=native -lssl -lcrypto -ldl

OpenSSL

Algorithm: RSA4096

OpenSSL 3.0 - sign/s, More Is Better
32 vCPUs: 1570.2 (SE +/- 0.06, N = 3)
16 vCPUs: 786.7 (SE +/- 0.06, N = 3)
8 vCPUs: 393.7 (SE +/- 0.07, N = 3)
(CC) gcc options: -pthread -O3 -march=native -lssl -lcrypto -ldl

Apache Spark

Row Count: 1000000 - Partitions: 100 - Calculate Pi Benchmark

Apache Spark 3.3 - Seconds, Fewer Is Better
32 vCPUs: 69.77 (SE +/- 0.06, N = 15)
16 vCPUs: 137.76 (SE +/- 0.11, N = 12)
8 vCPUs: 277.89 (SE +/- 0.17, N = 3)

Apache Spark

Row Count: 1000000 - Partitions: 2000 - Calculate Pi Benchmark

Apache Spark 3.3 - Seconds, Fewer Is Better
32 vCPUs: 69.92 (SE +/- 0.06, N = 15)
16 vCPUs: 137.17 (SE +/- 0.20, N = 3)
8 vCPUs: 278.47 (SE +/- 0.35, N = 3)

Apache Spark

Row Count: 40000000 - Partitions: 2000 - Calculate Pi Benchmark

Apache Spark 3.3 - Seconds, Fewer Is Better
32 vCPUs: 69.79 (SE +/- 0.11, N = 12)
16 vCPUs: 137.04 (SE +/- 0.17, N = 3)
8 vCPUs: 277.83 (SE +/- 0.13, N = 3)

Blender

Blend File: BMW27 - Compute: CPU-Only

Blender - Seconds, Fewer Is Better
32 vCPUs: 112.47 (SE +/- 0.10, N = 3)
16 vCPUs: 226.26 (SE +/- 0.50, N = 3)
8 vCPUs: 447.71 (SE +/- 0.04, N = 3)

NAS Parallel Benchmarks

Test / Class: EP.D

NAS Parallel Benchmarks 3.4 - Total Mop/s, More Is Better
32 vCPUs: 3265.68 (SE +/- 2.04, N = 3)
16 vCPUs: 1634.99 (SE +/- 1.03, N = 3)
8 vCPUs: 820.94 (SE +/- 0.56, N = 3)
(F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Stress-NG

Test: CPU Stress

Stress-NG 0.14 - Bogo Ops/s, More Is Better
32 vCPUs: 8209.47 (SE +/- 4.23, N = 3)
16 vCPUs: 4116.96 (SE +/- 2.80, N = 3)
8 vCPUs: 2065.53 (SE +/- 1.28, N = 3)
(CC) gcc options: -O3 -march=native -O2 -std=gnu99 -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread

Sysbench

Test: CPU

Sysbench 1.0.20 - Events Per Second, More Is Better
32 vCPUs: 108241.61 (SE +/- 23.77, N = 3)
16 vCPUs: 54317.42 (SE +/- 12.70, N = 3)
8 vCPUs: 27237.28 (SE +/- 6.95, N = 3)
(CC) gcc options: -O2 -funroll-loops -O3 -march=native -rdynamic -ldl -laio -lm

Stress-NG

Test: Matrix Math

Stress-NG 0.14 - Bogo Ops/s, More Is Better
32 vCPUs: 151792.83 (SE +/- 9.80, N = 3)
16 vCPUs: 76177.56 (SE +/- 10.44, N = 3)
8 vCPUs: 38215.95 (SE +/- 25.04, N = 3)
(CC) gcc options: -O3 -march=native -O2 -std=gnu99 -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread

Stress-NG

Test: Vector Math

Stress-NG 0.14 - Bogo Ops/s, More Is Better
32 vCPUs: 97749.08 (SE +/- 190.70, N = 3)
16 vCPUs: 49102.30 (SE +/- 27.43, N = 3)
8 vCPUs: 24633.99 (SE +/- 6.31, N = 3)
(CC) gcc options: -O3 -march=native -O2 -std=gnu99 -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread

Blender

Blend File: Fishy Cat - Compute: CPU-Only

Blender - Seconds, Fewer Is Better
32 vCPUs: 214.41 (SE +/- 0.42, N = 3)
16 vCPUs: 426.04 (SE +/- 0.85, N = 3)
8 vCPUs: 841.18 (SE +/- 1.74, N = 3)

SPECjbb 2015

SPECjbb2015-Composite max-jOPS

SPECjbb 2015 - jOPS, More Is Better
32 vCPUs: 35075
16 vCPUs: 18092
8 vCPUs: 9158

GROMACS

Implementation: MPI CPU - Input: water_GMX50_bare

GROMACS 2022.1 - Ns Per Day, More Is Better
32 vCPUs: 1.718 (SE +/- 0.010, N = 3)
16 vCPUs: 0.880 (SE +/- 0.001, N = 3)
8 vCPUs: 0.450 (SE +/- 0.000, N = 3)
(CXX) g++ options: -O3 -march=native

ASKAP

Test: tConvolve OpenMP - Degridding

ASKAP 1.0 - Million Grid Points Per Second, More Is Better
32 vCPUs: 9181.24 (SE +/- 0.00, N = 3)
16 vCPUs: 5023.70 (SE +/- 0.00, N = 3)
8 vCPUs: 2421.43 (SE +/- 33.24, N = 3)
(CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

NAS Parallel Benchmarks

Test / Class: SP.C

NAS Parallel Benchmarks 3.4 - Total Mop/s, More Is Better
32 vCPUs: 26843.58 (SE +/- 31.60, N = 3)
16 vCPUs: 19710.90 (SE +/- 112.56, N = 3)
8 vCPUs: 7115.28 (SE +/- 39.91, N = 3)
(F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

LAMMPS Molecular Dynamics Simulator

Model: 20k Atoms

LAMMPS Molecular Dynamics Simulator 23Jun2022 - ns/day, More Is Better
32 vCPUs: 16.550 (SE +/- 0.004, N = 3)
16 vCPUs: 8.499 (SE +/- 0.112, N = 3)
8 vCPUs: 4.662 (SE +/- 0.025, N = 3)
(CXX) g++ options: -O3 -march=native -ldl

LAMMPS Molecular Dynamics Simulator

Model: Rhodopsin Protein

LAMMPS Molecular Dynamics Simulator 23Jun2022 - ns/day, More Is Better
32 vCPUs: 16.596 (SE +/- 0.012, N = 3)
16 vCPUs: 8.861 (SE +/- 0.037, N = 3)
8 vCPUs: 4.812 (SE +/- 0.011, N = 3)
(CXX) g++ options: -O3 -march=native -ldl

Apache Spark

Row Count: 1000000 - Partitions: 100 - Calculate Pi Benchmark Using Dataframe

Apache Spark 3.3 - Seconds, Fewer Is Better
32 vCPUs: 4.79 (SE +/- 0.01, N = 15)
16 vCPUs: 8.39 (SE +/- 0.02, N = 12)
8 vCPUs: 15.92 (SE +/- 0.02, N = 3)

Apache Spark

Row Count: 1000000 - Partitions: 2000 - Calculate Pi Benchmark Using Dataframe

Apache Spark 3.3 - Seconds, Fewer Is Better
32 vCPUs: 4.80 (SE +/- 0.01, N = 15)
16 vCPUs: 8.29 (SE +/- 0.03, N = 3)
8 vCPUs: 15.81 (SE +/- 0.07, N = 3)

Apache Spark

Row Count: 40000000 - Partitions: 100 - Calculate Pi Benchmark Using Dataframe

Apache Spark 3.3 - Seconds, Fewer Is Better
32 vCPUs: 4.76 (SE +/- 0.01, N = 9)
16 vCPUs: 8.40 (SE +/- 0.01, N = 3)
8 vCPUs: 15.65 (SE +/- 0.05, N = 3)

Apache Spark

Row Count: 40000000 - Partitions: 2000 - Calculate Pi Benchmark Using Dataframe

Apache Spark 3.3 - Seconds, Fewer Is Better
32 vCPUs: 4.78 (SE +/- 0.02, N = 12)
16 vCPUs: 8.34 (SE +/- 0.03, N = 3)
8 vCPUs: 15.71 (SE +/- 0.00, N = 3)

ASKAP

Test: tConvolve OpenMP - Gridding

ASKAP 1.0 - Million Grid Points Per Second, More Is Better
32 vCPUs: 7262.74 (SE +/- 66.63, N = 3)
16 vCPUs: 3631.81 (SE +/- 43.40, N = 3)
8 vCPUs: 2296.10 (SE +/- 29.91, N = 3)
(CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

NAS Parallel Benchmarks

Test / Class: CG.C

NAS Parallel Benchmarks 3.4 - Total Mop/s, More Is Better
32 vCPUs: 21433.92 (SE +/- 35.67, N = 3)
16 vCPUs: 12171.95 (SE +/- 171.15, N = 3)
8 vCPUs: 6855.81 (SE +/- 49.63, N = 15)
(F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

TensorFlow Lite

Model: Inception V4

TensorFlow Lite 2022-05-18 - Microseconds, Fewer Is Better
32 vCPUs: 31657.3 (SE +/- 149.01, N = 3)
16 vCPUs: 46113.4 (SE +/- 84.24, N = 3)
8 vCPUs: 97646.1 (SE +/- 49.87, N = 3)

Timed MPlayer Compilation

Time To Compile

Timed MPlayer Compilation 1.5 - Seconds, Fewer Is Better
32 vCPUs: 28.93 (SE +/- 0.35, N = 4)
16 vCPUs: 47.62 (SE +/- 0.26, N = 3)
8 vCPUs: 88.84 (SE +/- 0.03, N = 3)

libavif avifenc

Encoder Speed: 6

libavif avifenc 0.10 - Seconds, Fewer Is Better
32 vCPUs: 6.682 (SE +/- 0.020, N = 3)
16 vCPUs: 11.168 (SE +/- 0.037, N = 3)
8 vCPUs: 20.273 (SE +/- 0.130, N = 3)
(CXX) g++ options: -O3 -fPIC -march=native -lm

Apache Spark

Row Count: 40000000 - Partitions: 2000 - Repartition Test Time

Apache Spark 3.3 - Seconds, Fewer Is Better
32 vCPUs: 22.22 (SE +/- 0.24, N = 12)
16 vCPUs: 35.45 (SE +/- 0.20, N = 3)
8 vCPUs: 66.27 (SE +/- 0.36, N = 3)

Timed Gem5 Compilation

Time To Compile

Timed Gem5 Compilation 21.2 - Seconds, Fewer Is Better
32 vCPUs: 312.12 (SE +/- 1.96, N = 3)
16 vCPUs: 495.54 (SE +/- 1.16, N = 3)
8 vCPUs: 917.39 (SE +/- 0.66, N = 3)

GPAW

Input: Carbon Nanotube

GPAW 22.1 - Seconds, Fewer Is Better
32 vCPUs: 130.35 (SE +/- 0.30, N = 3)
16 vCPUs: 208.97 (SE +/- 0.03, N = 3)
8 vCPUs: 381.20 (SE +/- 0.63, N = 3)
(CC) gcc options: -shared -fwrapv -O2 -O3 -march=native -lxc -lblas -lmpi

Timed FFmpeg Compilation

Time To Compile

Timed FFmpeg Compilation 4.4 - Seconds, Fewer Is Better
32 vCPUs: 38.96 (SE +/- 0.16, N = 3)
16 vCPUs: 61.98 (SE +/- 0.09, N = 3)
8 vCPUs: 113.46 (SE +/- 0.18, N = 3)

Apache Spark

Row Count: 40000000 - Partitions: 100 - Repartition Test Time

Apache Spark 3.3 - Seconds, Fewer Is Better
32 vCPUs: 24.36 (SE +/- 0.12, N = 9)
16 vCPUs: 37.22 (SE +/- 0.92, N = 3)
8 vCPUs: 68.68 (SE +/- 0.25, N = 3)

NAS Parallel Benchmarks

Test / Class: FT.C

NAS Parallel Benchmarks 3.4 - Total Mop/s, More Is Better
32 vCPUs: 52309.81 (SE +/- 41.18, N = 3)
16 vCPUs: 32644.85 (SE +/- 300.01, N = 3)
8 vCPUs: 18574.23 (SE +/- 15.96, N = 3)
(F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Apache Spark

Row Count: 40000000 - Partitions: 2000 - Broadcast Inner Join Test Time

Apache Spark 3.3 - Seconds, Fewer Is Better
32 vCPUs: 26.55 (SE +/- 0.17, N = 12)
16 vCPUs: 42.75 (SE +/- 0.40, N = 3)
8 vCPUs: 74.71 (SE +/- 0.33, N = 3)

NAS Parallel Benchmarks

Test / Class: LU.C

NAS Parallel Benchmarks 3.4 - Total Mop/s, More Is Better
32 vCPUs: 87702.30 (SE +/- 137.48, N = 3)
16 vCPUs: 55447.31 (SE +/- 701.24, N = 3)
8 vCPUs: 32029.14 (SE +/- 50.76, N = 3)
(F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Apache Spark

Row Count: 40000000 - Partitions: 2000 - Inner Join Test Time

Apache Spark 3.3 - Seconds, Fewer Is Better
32 vCPUs: 28.66 (SE +/- 0.19, N = 12)
16 vCPUs: 44.39 (SE +/- 0.73, N = 3)
8 vCPUs: 78.00 (SE +/- 1.11, N = 3)

TensorFlow Lite

Model: Inception ResNet V2

TensorFlow Lite 2022-05-18 - Microseconds, Fewer Is Better
32 vCPUs: 33994.9 (SE +/- 379.42, N = 3)
16 vCPUs: 45445.9 (SE +/- 16.18, N = 3)
8 vCPUs: 91592.6 (SE +/- 70.02, N = 3)

ASKAP

Test: Hogbom Clean OpenMP

ASKAP 1.0 - Iterations Per Second, More Is Better
32 vCPUs: 996.70 (SE +/- 3.30, N = 3)
16 vCPUs: 645.16 (SE +/- 0.00, N = 3)
8 vCPUs: 371.30 (SE +/- 1.21, N = 3)
(CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

Apache Spark

Row Count: 40000000 - Partitions: 100 - Inner Join Test Time

Apache Spark 3.3 - Seconds, Fewer Is Better
32 vCPUs: 30.32 (SE +/- 0.44, N = 9)
16 vCPUs: 44.62 (SE +/- 0.72, N = 3)
8 vCPUs: 80.02 (SE +/- 0.19, N = 3)

ASKAP

Test: tConvolve MT - Degridding

ASKAP 1.0 - Million Grid Points Per Second, More Is Better
32 vCPUs: 5522.07 (SE +/- 80.56, N = 15)
16 vCPUs: 4083.16 (SE +/- 2.61, N = 3)
8 vCPUs: 2196.74 (SE +/- 7.87, N = 3)
(CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

Apache Spark

Row Count: 40000000 - Partitions: 100 - Broadcast Inner Join Test Time

Apache Spark 3.3 - Seconds, Fewer Is Better
32 vCPUs: 31.98 (SE +/- 0.26, N = 9)
16 vCPUs: 45.29 (SE +/- 0.44, N = 3)
8 vCPUs: 80.26 (SE +/- 0.22, N = 3)

Apache Spark

Row Count: 1000000 - Partitions: 2000 - Broadcast Inner Join Test Time

Apache Spark 3.3 - Seconds, Fewer Is Better
32 vCPUs: 2.12 (SE +/- 0.02, N = 15)
16 vCPUs: 2.65 (SE +/- 0.05, N = 3)
8 vCPUs: 5.18 (SE +/- 0.07, N = 3)

OpenFOAM

Input: drivaerFastback, Medium Mesh Size - Execution Time

OpenFOAM 9 - Seconds, Fewer Is Better
32 vCPUs: 994.53
16 vCPUs: 1534.72
8 vCPUs: 2426.16
Extra options listed on the chart: -lfoamToVTK -ldynamicMesh -llagrangian -lfileFormats; -lfoamToVTK -ldynamicMesh -llagrangian -lfileFormats; -ltransportModels -lspecie -lfiniteVolume -lfvModels -lmeshTools -lsampling
(CXX) g++ options: -std=c++14 -O3 -mcpu=native -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lgenericPatchFields -lOpenFOAM -ldl -lm

Facebook RocksDB

Test: Read Random Write Random

Facebook RocksDB 7.0.1 - Op/s, More Is Better
32 vCPUs: 1321827 (SE +/- 9643.50, N = 15)
16 vCPUs: 884700 (SE +/- 1701.74, N = 3)
8 vCPUs: 548353 (SE +/- 4976.00, N = 15)
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Apache Spark

Row Count: 1000000 - Partitions: 100 - Repartition Test Time

Apache Spark 3.3 - Seconds, Fewer Is Better
32 vCPUs: 2.01 (SE +/- 0.03, N = 15)
16 vCPUs: 2.55 (SE +/- 0.03, N = 12)
8 vCPUs: 4.58 (SE +/- 0.02, N = 3)

Apache Spark

Row Count: 40000000 - Partitions: 2000 - SHA-512 Benchmark Time

Apache Spark 3.3 - Seconds, Fewer Is Better
32 vCPUs: 39.22 (SE +/- 0.55, N = 12)
16 vCPUs: 51.55 (SE +/- 0.08, N = 3)
8 vCPUs: 89.23 (SE +/- 0.52, N = 3)

libavif avifenc

Encoder Speed: 6, Lossless

libavif avifenc 0.10 - Seconds, Fewer Is Better
32 vCPUs: 10.34 (SE +/- 0.00, N = 3)
16 vCPUs: 14.70 (SE +/- 0.19, N = 3)
8 vCPUs: 23.31 (SE +/- 0.11, N = 3)
(CXX) g++ options: -O3 -fPIC -march=native -lm

Apache Spark

Row Count: 1000000 - Partitions: 2000 - Repartition Test Time

Apache Spark 3.3 - Seconds, Fewer Is Better
32 vCPUs: 2.60 (SE +/- 0.03, N = 15)
16 vCPUs: 3.36 (SE +/- 0.01, N = 3)
8 vCPUs: 5.67 (SE +/- 0.01, N = 3)

TensorFlow Lite

Model: Mobilenet Float

TensorFlow Lite 2022-05-18 - Microseconds, Fewer Is Better
32 vCPUs: 2093.25 (SE +/- 17.55, N = 3)
16 vCPUs: 2481.96 (SE +/- 4.32, N = 3)
8 vCPUs: 4395.73 (SE +/- 3.06, N = 3)

Apache Spark

Row Count: 1000000 - Partitions: 2000 - Inner Join Test Time

Apache Spark 3.3 - Seconds, Fewer Is Better
32 vCPUs: 2.87 (SE +/- 0.04, N = 15)
16 vCPUs: 3.65 (SE +/- 0.11, N = 3)
8 vCPUs: 5.98 (SE +/- 0.09, N = 3)

OpenFOAM

Input: drivaerFastback, Medium Mesh Size - Mesh Time

OpenFOAM 9 - Seconds, Fewer Is Better
32 vCPUs: 206.40
16 vCPUs: 303.71
8 vCPUs: 425.95
Extra options listed on the chart: -lfoamToVTK -ldynamicMesh -llagrangian -lfileFormats; -lfoamToVTK -ldynamicMesh -llagrangian -lfileFormats; -ltransportModels -lspecie -lfiniteVolume -lfvModels -lmeshTools -lsampling
(CXX) g++ options: -std=c++14 -O3 -mcpu=native -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lgenericPatchFields -lOpenFOAM -ldl -lm

Apache Spark

Row Count: 40000000 - Partitions: 100 - SHA-512 Benchmark Time

Apache Spark 3.3 - Seconds, Fewer Is Better
32 vCPUs: 46.30 (SE +/- 0.45, N = 9)
16 vCPUs: 51.40 (SE +/- 0.22, N = 3)
8 vCPUs: 93.67 (SE +/- 0.85, N = 3)

Apache Spark

Row Count: 40000000 - Partitions: 2000 - Group By Test Time

Apache Spark 3.3 - Seconds, Fewer Is Better
32 vCPUs: 22.84 (SE +/- 0.32, N = 12)
16 vCPUs: 30.70 (SE +/- 0.23, N = 3)
8 vCPUs: 45.61 (SE +/- 0.57, N = 3)

High Performance Conjugate Gradient

High Performance Conjugate Gradient 3.1 - GFLOP/s, More Is Better
32 vCPUs: 22.09 (SE +/- 0.01, N = 3)
16 vCPUs: 17.10 (SE +/- 0.03, N = 3)
8 vCPUs: 11.10 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

ASKAP

Test: tConvolve MPI - Gridding

ASKAP 1.0 - Mpix/sec, More Is Better
32 vCPUs: 3899.28 (SE +/- 42.99, N = 15)
16 vCPUs: 3343.25 (SE +/- 32.26, N = 3)
8 vCPUs: 1977.89 (SE +/- 23.42, N = 15)
(CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

Graph500

Scale: 26

Graph500 3.0 - bfs max_TEPS, More Is Better
32 vCPUs: 508372000
16 vCPUs: 262563000
(CC) gcc options: -fcommon -O3 -march=native -lpthread -lm -lmpi

ASKAP

Test: tConvolve MT - Gridding

ASKAP 1.0 - Million Grid Points Per Second, More Is Better
32 vCPUs: 4456.55 (SE +/- 35.89, N = 15)
16 vCPUs: 3789.01 (SE +/- 5.95, N = 3)
8 vCPUs: 2360.63 (SE +/- 5.73, N = 3)
(CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

Graph500

Scale: 26

Graph500 3.0 - bfs median_TEPS, More Is Better
32 vCPUs: 477377000
16 vCPUs: 257478000
(CC) gcc options: -fcommon -O3 -march=native -lpthread -lm -lmpi

Apache Spark

Row Count: 40000000 - Partitions: 100 - Group By Test Time

Apache Spark 3.3 - Seconds, Fewer Is Better
32 vCPUs: 27.64 (SE +/- 0.16, N = 9)
16 vCPUs: 35.64 (SE +/- 0.24, N = 3)
8 vCPUs: 50.87 (SE +/- 0.53, N = 3)

NAS Parallel Benchmarks

Test / Class: MG.C

NAS Parallel Benchmarks 3.4 - Total Mop/s, More Is Better
32 vCPUs: 50939.05 (SE +/- 31.40, N = 3)
16 vCPUs: 33309.76 (SE +/- 102.49, N = 3)
8 vCPUs: 27703.33 (SE +/- 46.23, N = 3)
(F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Graph500

Scale: 26

Graph500 3.0, Scale: 26 (sssp max_TEPS, More Is Better)
32 vCPUs: 169542000
16 vCPUs: 95265500
1. (CC) gcc options: -fcommon -O3 -march=native -lpthread -lm -lmpi

Graph500

Scale: 26

Graph500 3.0, Scale: 26 (sssp median_TEPS, More Is Better)
32 vCPUs: 124702000
16 vCPUs: 70750200
1. (CC) gcc options: -fcommon -O3 -march=native -lpthread -lm -lmpi

TensorFlow Lite

Model: SqueezeNet

TensorFlow Lite 2022-05-18, Model: SqueezeNet (Microseconds, Fewer Is Better)
32 vCPUs: 3853.90 (SE +/- 31.57, N = 8)
16 vCPUs: 3955.89 (SE +/- 11.05, N = 3)
8 vCPUs: 6618.32 (SE +/- 9.96, N = 3)

libavif avifenc

Encoder Speed: 0

libavif avifenc 0.10, Encoder Speed: 0 (Seconds, Fewer Is Better)
32 vCPUs: 266.34 (SE +/- 0.65, N = 3)
16 vCPUs: 328.97 (SE +/- 0.80, N = 3)
8 vCPUs: 456.24 (SE +/- 0.90, N = 3)
1. (CXX) g++ options: -O3 -fPIC -march=native -lm

NAS Parallel Benchmarks

Test / Class: IS.D

NAS Parallel Benchmarks 3.4, Test / Class: IS.D (Total Mop/s, More Is Better)
32 vCPUs: 1822.77 (SE +/- 0.86, N = 3)
16 vCPUs: 1498.45 (SE +/- 14.70, N = 3)
8 vCPUs: 1104.26 (SE +/- 1.14, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Apache Spark

Row Count: 1000000 - Partitions: 100 - Inner Join Test Time

Apache Spark 3.3, Row Count: 1000000 - Partitions: 100 - Inner Join Test Time (Seconds, Fewer Is Better)
32 vCPUs: 2.13 (SE +/- 0.02, N = 15)
16 vCPUs: 2.22 (SE +/- 0.03, N = 12)
8 vCPUs: 3.48 (SE +/- 0.04, N = 3)

Apache Spark

Row Count: 1000000 - Partitions: 2000 - SHA-512 Benchmark Time

Apache Spark 3.3, Row Count: 1000000 - Partitions: 2000 - SHA-512 Benchmark Time (Seconds, Fewer Is Better)
32 vCPUs: 4.96 (SE +/- 0.04, N = 15)
16 vCPUs: 5.91 (SE +/- 0.05, N = 3)
8 vCPUs: 8.09 (SE +/- 0.08, N = 3)

ASTC Encoder

Preset: Medium

ASTC Encoder 3.2, Preset: Medium (Seconds, Fewer Is Better)
32 vCPUs: 5.9825 (SE +/- 0.0035, N = 3)
16 vCPUs: 6.9449 (SE +/- 0.0194, N = 3)
8 vCPUs: 9.0505 (SE +/- 0.0253, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto -pthread

libavif avifenc

Encoder Speed: 10, Lossless

libavif avifenc 0.10, Encoder Speed: 10, Lossless (Seconds, Fewer Is Better)
32 vCPUs: 6.775 (SE +/- 0.072, N = 3)
16 vCPUs: 7.658 (SE +/- 0.065, N = 8)
8 vCPUs: 9.997 (SE +/- 0.096, N = 3)
1. (CXX) g++ options: -O3 -fPIC -march=native -lm

libavif avifenc

Encoder Speed: 2

libavif avifenc 0.10, Encoder Speed: 2 (Seconds, Fewer Is Better)
32 vCPUs: 169.64 (SE +/- 0.13, N = 3)
16 vCPUs: 194.77 (SE +/- 0.47, N = 3)
8 vCPUs: 245.60 (SE +/- 0.32, N = 3)
1. (CXX) g++ options: -O3 -fPIC -march=native -lm

Stress-NG

Test: System V Message Passing

Stress-NG 0.14, Test: System V Message Passing (Bogo Ops/s, More Is Better)
32 vCPUs: 6128517.10 (SE +/- 7551.56, N = 3)
16 vCPUs: 5475267.36 (SE +/- 15929.45, N = 3)
8 vCPUs: 4507844.17 (SE +/- 12538.93, N = 3)
1. (CC) gcc options: -O3 -march=native -O2 -std=gnu99 -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread

Apache Spark

Row Count: 1000000 - Partitions: 2000 - Group By Test Time

Apache Spark 3.3, Row Count: 1000000 - Partitions: 2000 - Group By Test Time (Seconds, Fewer Is Better)
32 vCPUs: 6.72 (SE +/- 0.05, N = 15)
16 vCPUs: 7.43 (SE +/- 0.10, N = 3)
8 vCPUs: 8.85 (SE +/- 0.13, N = 3)

Stress-NG

Test: CPU Cache

Stress-NG 0.14, Test: CPU Cache (Bogo Ops/s, More Is Better)
32 vCPUs: 566.91 (SE +/- 0.28, N = 3)
16 vCPUs: 551.25 (SE +/- 2.05, N = 3)
8 vCPUs: 436.31 (SE +/- 2.30, N = 3)
1. (CC) gcc options: -O3 -march=native -O2 -std=gnu99 -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread

TNN

Target: CPU - Model: DenseNet

TNN 0.3, Target: CPU - Model: DenseNet (ms, Fewer Is Better)
32 vCPUs: 3056.90 (SE +/- 6.90, N = 3; MIN: 2928.19 / MAX: 3237.58)
16 vCPUs: 3358.73 (SE +/- 12.43, N = 3; MIN: 3163.2 / MAX: 3575.85)
8 vCPUs: 3842.12 (SE +/- 9.75, N = 3; MIN: 3619.38 / MAX: 4060.16)
1. (CXX) g++ options: -O3 -march=native -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

VP9 libvpx Encoding

Speed: Speed 5 - Input: Bosphorus 4K

VP9 libvpx Encoding 1.10.0, Speed: Speed 5 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
32 vCPUs: 6.99 (SE +/- 0.02, N = 3)
16 vCPUs: 6.68 (SE +/- 0.01, N = 3)
8 vCPUs: 6.11 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -lm -lpthread -O3 -march=native -march=armv8-a -fPIC -U_FORTIFY_SOURCE -std=gnu++11

VP9 libvpx Encoding

Speed: Speed 5 - Input: Bosphorus 1080p

VP9 libvpx Encoding 1.10.0, Speed: Speed 5 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
32 vCPUs: 12.11 (SE +/- 0.01, N = 3)
16 vCPUs: 11.78 (SE +/- 0.02, N = 3)
8 vCPUs: 11.27 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -lm -lpthread -O3 -march=native -march=armv8-a -fPIC -U_FORTIFY_SOURCE -std=gnu++11

VP9 libvpx Encoding

Speed: Speed 0 - Input: Bosphorus 1080p

VP9 libvpx Encoding 1.10.0, Speed: Speed 0 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
32 vCPUs: 4.99 (SE +/- 0.01, N = 3)
16 vCPUs: 4.84 (SE +/- 0.01, N = 3)
8 vCPUs: 4.65 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -lm -lpthread -O3 -march=native -march=armv8-a -fPIC -U_FORTIFY_SOURCE -std=gnu++11

TNN

Target: CPU - Model: MobileNet v2

TNN 0.3, Target: CPU - Model: MobileNet v2 (ms, Fewer Is Better)
32 vCPUs: 322.77 (SE +/- 0.05, N = 3; MIN: 319.63 / MAX: 326.43)
16 vCPUs: 328.89 (SE +/- 1.36, N = 3; MIN: 322.15 / MAX: 373.8)
8 vCPUs: 331.34 (SE +/- 0.78, N = 3; MIN: 327.36 / MAX: 339.94)
1. (CXX) g++ options: -O3 -march=native -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

Facebook RocksDB

Test: Read While Writing

Facebook RocksDB 7.0.1, Test: Read While Writing (Op/s, More Is Better)
32 vCPUs: 2610992 (SE +/- 32390.32, N = 12)
16 vCPUs: 1264826 (SE +/- 20446.66, N = 15)
8 vCPUs: 594702 (SE +/- 9067.95, N = 15)
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Stress-NG

Test: Futex

Stress-NG 0.14, Test: Futex (Bogo Ops/s, More Is Better)
32 vCPUs: 1437660.62 (SE +/- 15026.23, N = 3)
16 vCPUs: 1198681.87 (SE +/- 30917.20, N = 15)
8 vCPUs: 937451.99 (SE +/- 36001.77, N = 15)
1. (CC) gcc options: -O3 -march=native -O2 -std=gnu99 -lm -laio -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread

ASKAP

Test: tConvolve MPI - Degridding

ASKAP 1.0, Test: tConvolve MPI - Degridding (Mpix/sec, More Is Better)
32 vCPUs: 3962.08 (SE +/- 54.84, N = 15)
16 vCPUs: 2585.31 (SE +/- 12.80, N = 3)
8 vCPUs: 1325.06 (SE +/- 24.91, N = 15)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

Apache Spark

Row Count: 1000000 - Partitions: 100 - Broadcast Inner Join Test Time

Apache Spark 3.3, Row Count: 1000000 - Partitions: 100 - Broadcast Inner Join Test Time (Seconds, Fewer Is Better)
32 vCPUs: 1.68 (SE +/- 0.03, N = 15)
16 vCPUs: 1.79 (SE +/- 0.04, N = 12)
8 vCPUs: 2.86 (SE +/- 0.03, N = 3)

Apache Spark

Row Count: 1000000 - Partitions: 100 - SHA-512 Benchmark Time

Apache Spark 3.3, Row Count: 1000000 - Partitions: 100 - SHA-512 Benchmark Time (Seconds, Fewer Is Better)
32 vCPUs: 4.79 (SE +/- 0.11, N = 15)
16 vCPUs: 4.93 (SE +/- 0.05, N = 12)
8 vCPUs: 6.28 (SE +/- 0.03, N = 3)

Renaissance

Test: Savina Reactors.IO

Renaissance 0.14, Test: Savina Reactors.IO (ms, Fewer Is Better)
32 vCPUs: 10705.9 (SE +/- 131.70, N = 4; MIN: 10505.49 / MAX: 14847.21)
16 vCPUs: 15981.5 (SE +/- 583.25, N = 12; MIN: 12776.53 / MAX: 36273.51)
8 vCPUs: 26456.4 (SE +/- 435.60, N = 9; MIN: 13667.16 / MAX: 42318.14)

Renaissance

Test: Apache Spark Bayes

Renaissance 0.14, Test: Apache Spark Bayes (ms, Fewer Is Better)
32 vCPUs: 766.4 (SE +/- 9.73, N = 3; MIN: 495.95 / MAX: 1178.88)
16 vCPUs: 1262.0 (SE +/- 7.45, N = 3; MIN: 877.37 / MAX: 1398.23)
8 vCPUs: 2249.8 (SE +/- 42.77, N = 15; MIN: 1478.18 / MAX: 2434.18)

DaCapo Benchmark

Java Test: Tradesoap

DaCapo Benchmark 9.12-MR1, Java Test: Tradesoap (msec, Fewer Is Better)
32 vCPUs: 5015 (SE +/- 95.95, N = 20)
16 vCPUs: 5598 (SE +/- 52.52, N = 4)
8 vCPUs: 8040 (SE +/- 68.63, N = 4)


Phoronix Test Suite v10.8.5