This lecture is only for people who don't want to use Amazon Web Services. Skip it if you plan to use the AWS free tier and follow along with the instructions in the next lecture. I highly recommend you learn how to set up Spark on AWS; it is an extremely valuable skill to have, especially in the eyes of potential employers.

Local Spark Set-up Options 

It is highly recommended that you set up Spark on Amazon Web Services, since that is the sort of operation you would do in a real-world setting. It doesn't really make sense to install Spark on a local computer, because the whole point of Spark is that your data is too big to be handled by a single local computer.

However, if you still really want to install it locally, the best way to go about it is to install Ubuntu on your system:

Windows Instructions:

http://www.ubuntu.com/download/desktop/create-a-usb-stick-on-windows

Mac OS Instructions:

http://www.ubuntu.com/download/desktop/create-a-usb-stick-on-mac-osx

Once you have Ubuntu installed on your local computer, you can follow the instructions in the lecture titled "PySpark Setup", which is 3 lectures ahead of this one. That lecture shows you how to install Spark and Hadoop onto a local Ubuntu computer. Again, I don't recommend installing it locally; it is much better practice, and better for your resume, if you understand how to install Spark on an EC2 instance on Amazon Web Services.

Questions? Post them to the QA Forum!

Thanks!

Jose