Apache Spark AMD EPYC

2 x AMD EPYC 7713 64-Core testing with an AMD DAYTONA_X (RYM1009B BIOS) and ASPEED on Ubuntu 22.04 via the Phoronix Test Suite.

A, B, C (identical configuration for all three runs):

  Processor: 2 x AMD EPYC 7713 64-Core @ 2.00GHz (128 Cores / 256 Threads), Motherboard: AMD DAYTONA_X (RYM1009B BIOS), Chipset: AMD Starship/Matisse, Memory: 512GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Monitor: VE228, Network: 2 x Mellanox MT27710

  OS: Ubuntu 22.04, Kernel: 5.19.0-051900daily20220803-generic (x86_64), Desktop: GNOME Shell 42.2, Display Server: X Server 1.21.1.3, Vulkan: 1.2.204, Compiler: GCC 11.2.0, File-System: ext4, Screen Resolution: 1920x1080

Apache Spark 3.3
Row Count: 1000000 - Partitions: 100 - SHA-512 Benchmark Time
Seconds < Lower Is Better
A . 4.54
B . 4.62
C . 4.45

Apache Spark 3.3
Row Count: 1000000 - Partitions: 100 - Calculate Pi Benchmark
Seconds < Lower Is Better
A . 17.58
B . 17.53
C . 17.75

Apache Spark 3.3
Row Count: 1000000 - Partitions: 100 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
A . 3.67
B . 3.12
C . 3.48

Apache Spark 3.3
Row Count: 1000000 - Partitions: 100 - Group By Test Time
Seconds < Lower Is Better
A . 6.53
B . 6.78
C . 6.91

Apache Spark 3.3
Row Count: 1000000 - Partitions: 100 - Repartition Test Time
Seconds < Lower Is Better
A . 2.94
B . 2.75
C . 2.91

Apache Spark 3.3
Row Count: 1000000 - Partitions: 100 - Inner Join Test Time
Seconds < Lower Is Better
A . 2.92
B . 3.17
C . 3.21

Apache Spark 3.3
Row Count: 1000000 - Partitions: 100 - Broadcast Inner Join Test Time
Seconds < Lower Is Better
A . 1.86
B . 1.84
C . 1.99

Apache Spark 3.3
Row Count: 1000000 - Partitions: 500 - SHA-512 Benchmark Time
Seconds < Lower Is Better
A . 5.93
B . 6.01
C . 5.58
C . 5.54
C . 6.10
C . 6.05
C . 6.06

Apache Spark 3.3
Row Count: 1000000 - Partitions: 500 - Calculate Pi Benchmark
Seconds < Lower Is Better
A . 17.47
B . 17.49
C . 17.19
C . 17.30
C . 17.65
C . 17.66
C . 17.17

Apache Spark 3.3
Row Count: 1000000 - Partitions: 500 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
A . 3.18
B . 3.22
C . 2.93
C . 3.79
C . 2.87
C . 3.19
C . 2.69

Apache Spark 3.3
Row Count: 1000000 - Partitions: 1000 - SHA-512 Benchmark Time
Seconds < Lower Is Better
A . 5.91
B . 6.36
C . 6.08

Apache Spark 3.3
Row Count: 1000000 - Partitions: 1000 - Calculate Pi Benchmark
Seconds < Lower Is Better
A . 17.45
B . 17.59
C . 17.80

Apache Spark 3.3
Row Count: 1000000 - Partitions: 1000 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
A . 3.38
B . 2.46
C . 3.03

Apache Spark 3.3
Row Count: 1000000 - Partitions: 1000 - Group By Test Time
Seconds < Lower Is Better
A . 11.95
B . 14.25
C . 11.05

Apache Spark 3.3
Row Count: 1000000 - Partitions: 1000 - Repartition Test Time
Seconds < Lower Is Better
A . 5.49
B . 6.25
C . 6.16

Apache Spark 3.3
Row Count: 1000000 - Partitions: 1000 - Inner Join Test Time
Seconds < Lower Is Better
A . 6.58
B . 6.93
C . 7.01

Apache Spark 3.3
Row Count: 1000000 - Partitions: 1000 - Broadcast Inner Join Test Time
Seconds < Lower Is Better
A . 3.90
B . 3.97
C . 4.13

Apache Spark 3.3
Row Count: 1000000 - Partitions: 2000 - SHA-512 Benchmark Time
Seconds < Lower Is Better
A . 5.91
B . 6.16
C . 6.13
C . 5.91

Apache Spark 3.3
Row Count: 1000000 - Partitions: 2000 - Calculate Pi Benchmark
Seconds < Lower Is Better
A . 17.52
B . 17.55
C . 17.46
C . 17.69

Apache Spark 3.3
Row Count: 1000000 - Partitions: 2000 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
A . 3.21
B . 3.64
C . 3.31
C . 3.25

Apache Spark 3.3
Row Count: 1000000 - Partitions: 2000 - Group By Test Time
Seconds < Lower Is Better
A . 11.05
B . 10.96
C . 11.86

Apache Spark 3.3
Row Count: 1000000 - Partitions: 2000 - Repartition Test Time
Seconds < Lower Is Better
A . 6.04
B . 5.98
C . 8.51

Apache Spark 3.3
Row Count: 1000000 - Partitions: 2000 - Inner Join Test Time
Seconds < Lower Is Better
A . 7.18
B . 7.34
C . 7.47

Apache Spark 3.3
Row Count: 1000000 - Partitions: 2000 - Broadcast Inner Join Test Time
Seconds < Lower Is Better
A . 4.42
B . 4.04
C . 4.44

Apache Spark 3.3
Row Count: 10000000 - Partitions: 100 - SHA-512 Benchmark Time
Seconds < Lower Is Better
A . 11.60
B . 11.63
C . 11.50
C . 12.24

Apache Spark 3.3
Row Count: 10000000 - Partitions: 100 - Calculate Pi Benchmark
Seconds < Lower Is Better
A . 17.48
B . 17.37
C . 17.39
C . 17.38

Apache Spark 3.3
Row Count: 10000000 - Partitions: 100 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
A . 3.51
B . 3.68
C . 2.28
C . 2.04

Apache Spark 3.3
Row Count: 10000000 - Partitions: 100 - Group By Test Time
Seconds < Lower Is Better
A . 12.32
B . 13.26
C . 12.67

Apache Spark 3.3
Row Count: 10000000 - Partitions: 100 - Repartition Test Time
Seconds < Lower Is Better
A . 7.28
B . 8.37
C . 8.16

Apache Spark 3.3
Row Count: 10000000 - Partitions: 100 - Inner Join Test Time
Seconds < Lower Is Better
A . 8.51
B . 9.12
C . 9.27

Apache Spark 3.3
Row Count: 10000000 - Partitions: 100 - Broadcast Inner Join Test Time
Seconds < Lower Is Better
A . 5.81
B . 6.23
C . 6.03

Apache Spark 3.3
Row Count: 10000000 - Partitions: 500 - SHA-512 Benchmark Time
Seconds < Lower Is Better
A . 13.29
B . 12.85
C . 12.72

Apache Spark 3.3
Row Count: 10000000 - Partitions: 500 - Calculate Pi Benchmark
Seconds < Lower Is Better
A . 17.45
B . 17.49
C . 17.48

Apache Spark 3.3
Row Count: 10000000 - Partitions: 500 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
A . 3.80
B . 3.31
C . 3.08

Apache Spark 3.3
Row Count: 20000000 - Partitions: 100 - SHA-512 Benchmark Time
Seconds < Lower Is Better
A . 18.83
B . 19.20
C . 19.49
C . 20.19
C . 19.33

Apache Spark 3.3
Row Count: 20000000 - Partitions: 100 - Calculate Pi Benchmark
Seconds < Lower Is Better
A . 17.38
B . 17.40
C . 17.40
C . 17.35
C . 17.19

Apache Spark 3.3
Row Count: 20000000 - Partitions: 100 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
A . 4.25
B . 3.39
C . 3.62
C . 2.90
C . 3.40

Apache Spark 3.3
Row Count: 20000000 - Partitions: 500 - SHA-512 Benchmark Time
Seconds < Lower Is Better
A . 20.37
B . 20.46
C . 20.26

Apache Spark 3.3
Row Count: 20000000 - Partitions: 500 - Calculate Pi Benchmark
Seconds < Lower Is Better
A . 17.45
B . 17.64
C . 17.26

Apache Spark 3.3
Row Count: 20000000 - Partitions: 500 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
A . 3.44
B . 3.82
C . 3.40

Apache Spark 3.3
Row Count: 40000000 - Partitions: 100 - SHA-512 Benchmark Time
Seconds < Lower Is Better
A . 34.96
B . 35.12
C . 33.94

Apache Spark 3.3
Row Count: 40000000 - Partitions: 100 - Calculate Pi Benchmark
Seconds < Lower Is Better
A . 17.54
B . 17.51
C . 17.71

Apache Spark 3.3
Row Count: 40000000 - Partitions: 100 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
A . 3.44
B . 3.09
C . 2.53

Apache Spark 3.3
Row Count: 40000000 - Partitions: 500 - SHA-512 Benchmark Time
Seconds < Lower Is Better
A . 35.26
B . 35.03
C . 35.28

Apache Spark 3.3
Row Count: 40000000 - Partitions: 500 - Calculate Pi Benchmark
Seconds < Lower Is Better
A . 17.67
B . 17.41
C . 17.28

Apache Spark 3.3
Row Count: 40000000 - Partitions: 500 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
A . 3.38
B . 3.28
C . 3.97

Apache Spark 3.3
Row Count: 10000000 - Partitions: 1000 - SHA-512 Benchmark Time
Seconds < Lower Is Better
A . 12.91
B . 13.09
C . 12.53

Apache Spark 3.3
Row Count: 10000000 - Partitions: 1000 - Calculate Pi Benchmark
Seconds < Lower Is Better
A . 17.44
B . 17.54
C . 17.61

Apache Spark 3.3
Row Count: 10000000 - Partitions: 1000 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
A . 3.54
B . 3.22
C . 2.89

Apache Spark 3.3
Row Count: 10000000 - Partitions: 2000 - SHA-512 Benchmark Time
Seconds < Lower Is Better
A . 13.19
B . 13.09
C . 13.45
C . 13.03
C . 13.69
C . 13.22
C . 12.62

Apache Spark 3.3
Row Count: 10000000 - Partitions: 2000 - Calculate Pi Benchmark
Seconds < Lower Is Better
A . 17.48
B . 17.42
C . 17.11
C . 17.45
C . 17.22
C . 17.27
C . 17.80

Apache Spark 3.3
Row Count: 10000000 - Partitions: 2000 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
A . 3.34
B . 3.22
C . 3.85
C . 3.00
C . 2.56
C . 3.42
C . 4.09

Apache Spark 3.3
Row Count: 20000000 - Partitions: 1000 - SHA-512 Benchmark Time
Seconds < Lower Is Better
A . 20.95
B . 20.65
C . 21.42

Apache Spark 3.3
Row Count: 20000000 - Partitions: 1000 - Calculate Pi Benchmark
Seconds < Lower Is Better
A . 17.47
B . 17.50
C . 17.43

Apache Spark 3.3
Row Count: 20000000 - Partitions: 1000 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
A . 3.34
B . 3.57
C . 2.56

Apache Spark 3.3
Row Count: 20000000 - Partitions: 2000 - SHA-512 Benchmark Time
Seconds < Lower Is Better
A . 20.96
B . 20.47
C . 21.74

Apache Spark 3.3
Row Count: 20000000 - Partitions: 2000 - Calculate Pi Benchmark
Seconds < Lower Is Better
A . 17.42
B . 17.30
C . 17.61

Apache Spark 3.3
Row Count: 20000000 - Partitions: 2000 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
A . 3.67
B . 3.62
C . 3.33

Apache Spark 3.3
Row Count: 40000000 - Partitions: 1000 - SHA-512 Benchmark Time
Seconds < Lower Is Better
A . 35.18
B . 35.55
C . 36.21

Apache Spark 3.3
Row Count: 40000000 - Partitions: 1000 - Calculate Pi Benchmark
Seconds < Lower Is Better
A . 17.33
B . 17.31
C . 17.84

Apache Spark 3.3
Row Count: 40000000 - Partitions: 1000 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
A . 2.90
B . 3.55
C . 3.80

Apache Spark 3.3
Row Count: 40000000 - Partitions: 2000 - SHA-512 Benchmark Time
Seconds < Lower Is Better
A . 35.78
B . 35.36
C . 35.07

Apache Spark 3.3
Row Count: 40000000 - Partitions: 2000 - Calculate Pi Benchmark
Seconds < Lower Is Better
A . 17.63
B . 17.54
C . 17.31

Apache Spark 3.3
Row Count: 40000000 - Partitions: 2000 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
A . 2.04000000
B . 3.43000000
C . 4.36702926

Apache Spark 3.3
Row Count: 1000000 - Partitions: 500 - Group By Test Time
Seconds < Lower Is Better
B . 15.62

Apache Spark 3.3
Row Count: 1000000 - Partitions: 500 - Repartition Test Time
Seconds < Lower Is Better
B . 4.87

Apache Spark 3.3
Row Count: 1000000 - Partitions: 500 - Inner Join Test Time
Seconds < Lower Is Better
B . 5.59

Apache Spark 3.3
Row Count: 1000000 - Partitions: 500 - Broadcast Inner Join Test Time
Seconds < Lower Is Better
B . 4.49

Apache Spark 3.3
Row Count: 20000000 - Partitions: 100 - Group By Test Time
Seconds < Lower Is Better
B . 37.61
C . 39.26

Apache Spark 3.3
Row Count: 20000000 - Partitions: 100 - Repartition Test Time
Seconds < Lower Is Better
B . 13.48
C . 13.86

Apache Spark 3.3
Row Count: 20000000 - Partitions: 100 - Inner Join Test Time
Seconds < Lower Is Better
B . 14.25
C . 12.76

Apache Spark 3.3
Row Count: 20000000 - Partitions: 100 - Broadcast Inner Join Test Time
Seconds < Lower Is Better
B . 12.02
C . 10.87

Apache Spark 3.3
Row Count: 10000000 - Partitions: 2000 - Group By Test Time
Seconds < Lower Is Better
B . 18.19

Apache Spark 3.3
Row Count: 10000000 - Partitions: 2000 - Repartition Test Time
Seconds < Lower Is Better
B . 14.44

Apache Spark 3.3
Row Count: 10000000 - Partitions: 2000 - Inner Join Test Time
Seconds < Lower Is Better
B . 12.76

Apache Spark 3.3
Row Count: 10000000 - Partitions: 2000 - Broadcast Inner Join Test Time
Seconds < Lower Is Better
B . 8.18