This tutorial explains several workarounds for flattening (exploding) two or more array columns in PySpark.

PySpark: DataFrame Multiple Explode



Using the arrays_zip() function: arrays_zip can be combined with explode to flatten multiple array columns together. It works with any number of array columns. This is our preferred approach to flattening multiple array columns.


Using the zip_with() function: zip_with can be combined with explode to flatten array columns together, merging two arrays at a time. This function was added to PySpark in version 3.1.0. If any array column value may be null, wrap that column in coalesce; otherwise this approach will not give the required output.


Using join: Multiple array columns can be flattened individually and then joined to achieve the required result. This method will not give the required output if there are null values in any of the array columns.