
PySpark: Dataframe To File (Part 2)

This tutorial will explain how to write a Spark dataframe into various types of files (such as JSON, Parquet, ORC and Avro).


Write as JSON file: The json() function can be used to write data into a JSON file. This function takes a path to the directory where the file(s) will be created.
Write as Parquet file: The parquet() function can be used to write data into a Parquet file. This function takes a path to the directory where the file(s) will be created.
Write as ORC file: Spark also supports the ORC file format, which is mostly used in Hive. The orc() function can be used for this purpose. This function takes a path to the directory where the file(s) will be created.
Write as Avro file: The Avro file format is not native to Spark, so a spark-avro_x.xx-x.x.x.jar must be added to the Spark library to read/write Avro files. The Spark Avro jar can be downloaded from the Maven repository, for example spark-avro_2.12-3.0.3.jar (the Scala and Spark versions in the artifact name must match your installation).