This tutorial explains how to use the functions available in the DataFrameNaFunctions class to handle null or missing values.

PySpark: Dataframe Handling Nulls


drop: This function, reached through the 'na' attribute of a dataframe (a DataFrameNaFunctions object), removes rows that contain null values. 'na.drop' and 'dropna' are aliases of each other.


fill: This function, available either through the 'na' attribute of a dataframe or as the 'fillna' dataframe function, replaces null values in dataframe rows with a given value. 'na.fill' and 'fillna' are aliases of each other.


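Filter Null Values: An equality comparison against null never evaluates to true, so null values are queried with the isNull method of a column (for example one built with the col function). For example, filtering on col('manager_id').isNull() fetches the rows where manager_id is null.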
Filter not Null Values: The isNotNull method of a column keeps only the rows where the value is not null, which filters the null values out.