This tutorial explains the explode, posexplode, explode_outer, and posexplode_outer functions available in PySpark for flattening (exploding) an array column into rows.

PySpark: DataFrame Explode



explode function(): The explode function flattens an array column so that each array element becomes its own row. Rows whose array is null or empty produce no output rows.


posexplode function(): The posexplode function works like explode, flattening array column values into rows, but it also returns the zero-based position of each value within its array as an additional column.


explode_outer function(): The explode_outer function works like explode with one difference: explode drops rows whose array is null or empty, whereas explode_outer keeps those rows and returns null in place of the array value.


posexplode_outer function(): The posexplode_outer function works like posexplode with one difference: posexplode drops rows whose array is null or empty, whereas posexplode_outer keeps those rows and returns null for both the position and the array value.