
PySpark: Overview and setup (Mac)

This tutorial gives a high-level overview of Spark and shows how to set up Spark/PySpark on a Mac. Spark is an in-memory processing framework that supports most storage systems, from HDFS to cloud storage. Spark is typically much faster than MapReduce (Hadoop), largely because it keeps intermediate data in memory rather than writing it to disk between processing steps.


Hadoop Setup: If you want to use Hive with Spark, complete the Hadoop setup first. The steps are available on the Hadoop Setup page.

Hive Setup: If you want to use Hive with Spark, complete the Hive setup first. The steps are available on the Hive Setup page.

PySpark setup (Mac)

Spark Setup: Download the Spark binary for the required version from the Apache website, https://spark.apache.org/downloads.html. Place and extract the Spark package in the $HOME/hadoop directory.
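The extraction step can also be scripted. The following is a minimal sketch, not the only way to do it: it assumes the download is a gzip-compressed tar archive that unpacks into a single top-level directory, and the function name extract_spark and the archive file name in the usage comment are hypothetical.

```python
import tarfile
from pathlib import Path


def extract_spark(archive_path, target_dir=None):
    """Extract a downloaded Spark .tgz archive and return the extracted directory."""
    # Default to $HOME/hadoop, the directory used in this tutorial.
    target = Path(target_dir) if target_dir else Path.home() / "hadoop"
    target.mkdir(parents=True, exist_ok=True)

    with tarfile.open(archive_path, "r:gz") as archive:
        # Assumption: the archive unpacks into one top-level directory,
        # so the first path component of the first member is its name.
        top_level = archive.getnames()[0].split("/")[0]
        archive.extractall(target)

    return target / top_level


# Usage (hypothetical file name; substitute the archive you downloaded):
# spark_dir = extract_spark(Path.home() / "Downloads" / "spark-bin.tgz")
# print(spark_dir)
```

The returned path is the extracted Spark directory, which is where the bin/pyspark launcher lives.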

Pyspark Shell
Spark Version
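As a quick check of the installation, the version can also be read from plain Python. This is a minimal sketch, assuming PySpark is importable from the active Python environment; the helper name installed_pyspark_version is hypothetical. Inside the PySpark shell, the predefined session reports the same information through spark.version.

```python
def installed_pyspark_version():
    """Return the installed PySpark version string, or None if PySpark cannot be imported."""
    try:
        import pyspark
    except ImportError:
        # PySpark is not on this interpreter's path.
        return None
    return pyspark.__version__


print(installed_pyspark_version())
```

A result of None usually means the interpreter being used is not the one where PySpark was installed.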