Local Installation

This post walks through how to download and set up VirtualBox with Ubuntu. Then we will walk through installing Spark, Python and the Jupyter Notebook on this VirtualBox Ubuntu.

First you will need to download two things: VirtualBox and Ubuntu.

Download VirtualBox

VirtualBox lets you run a virtual computer on your own physical computer.

Open the VirtualBox download page and you will see several download options. Click the one that matches your host, that is, the operating system of the machine you are using.

Double-click the downloaded file and follow the instructions, keeping all the defaults.

Download Ubuntu

Once you have downloaded VirtualBox, you have to download Ubuntu. Go to the Ubuntu website; there are different download options, but we need the Ubuntu Desktop version.

Configure VirtualBox

Once you have opened the VirtualBox manager, click New in the top left corner. It will ask for a name for the virtual machine; we will call it myspark. Change the type to Linux and the version to Ubuntu (64-bit).

Click Next and choose the memory size. This depends on the amount of RAM your computer has; depending on the applications you plan to run, we suggest 4-8 GB.

Next comes the hard disk. We are going to create a virtual disk. Choose the VDI (VirtualBox Disk Image) type. A fixed-size disk may take longer to create on some systems but is often faster to use, which is why we choose it. Give it 20 GB and click Create.
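
For readers who prefer the command line, the same settings can be sketched with VirtualBox's VBoxManage tool. This is only an illustration of the GUI steps above, not part of the original walkthrough: the disk file location is an assumption, and the script defaults to a dry run that prints the commands instead of executing them.

```shell
#!/usr/bin/env bash
# Sketch: the GUI settings above expressed as VBoxManage commands.
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

VM_NAME="myspark"        # name chosen in the text
VM_MEMORY_MB=4096        # 4 GB, the lower end of the suggested 4-8 GB
VM_DISK_MB=20480         # 20 GB
# Assumed disk location; adjust it to wherever you keep your virtual machines.
VM_DISK_FILE="$HOME/VirtualBox VMs/$VM_NAME/$VM_NAME.vdi"

# run CMD ARGS...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

create_vm() {
    # Register a 64-bit Ubuntu machine called myspark.
    run VBoxManage createvm --name "$VM_NAME" --ostype Ubuntu_64 --register
    # Give it the chosen amount of RAM.
    run VBoxManage modifyvm "$VM_NAME" --memory "$VM_MEMORY_MB"
    # Create a fixed-size VDI disk.
    run VBoxManage createmedium disk --filename "$VM_DISK_FILE" \
        --size "$VM_DISK_MB" --format VDI --variant Fixed
}

create_vm
```

The disk would still have to be attached to the machine afterwards, which the GUI wizard does for you, so the wizard remains the simpler route for a first installation.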

Double-click the virtual machine you created. You will see a pop-up that says Select start-up disk; point it to the Ubuntu image that you downloaded before.

Ubuntu installation

You will see a small pop-up offering either Try Ubuntu or Install Ubuntu. Choose Install Ubuntu; it will only be installed on your virtual machine. Select the option to download updates while installing Ubuntu and click Continue. On the next page, choose Erase disk and install Ubuntu. Then select your time zone and keyboard layout and enter your credentials. And voilà, Ubuntu is installed.

Python and Spark

The first thing we want to do is confirm that Python 3.5 (or later) is already on Ubuntu. Open a Terminal and type python3 at the ~$ prompt; you should get Python 3.5…
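
If you would rather check the version without opening the interactive interpreter, something like the following sketch could be used. The helper name version_at_least is made up for this example, and it relies on the version ordering of GNU sort, which ships with Ubuntu.

```shell
#!/usr/bin/env bash
# version_at_least HAVE WANT: succeeds when HAVE >= WANT (compared with GNU "sort -V").
version_at_least() {
    have="$1"
    want="$2"
    # The smaller of the two versions sorts first; HAVE is new enough when that is WANT.
    [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

if command -v python3 >/dev/null 2>&1; then
    # "python3 --version" prints "Python <number>"; keep only the number.
    installed="$(python3 --version 2>&1 | awk '{print $2}')"
    if version_at_least "$installed" "3.5"; then
        echo "python3 $installed is recent enough"
    else
        echo "python3 $installed is older than 3.5"
    fi
else
    echo "python3 was not found on PATH"
fi
```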

Now we are going to install the Jupyter Notebook system. To do this, run the following command:

pip3 install jupyter

If it says that pip3 is not installed, run the following command:

sudo apt install python3-pip

Run the previous command again to install Jupyter Notebook. Once this is done, just type jupyter notebook at the ~$ prompt and the Notebook system opens automatically. If it does not, copy the link that appears in the terminal and paste it into your browser.
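
The two install commands above can also be folded into one guarded step, so pip3 is only installed when it is actually missing. The function names have_cmd and install_jupyter are made up for this sketch, and the -y flag (which skips apt's confirmation prompt) is an addition to the command in the text.

```shell
#!/usr/bin/env bash
# have_cmd NAME: succeeds when NAME is a command available on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# install_jupyter is only defined here, not called, because it changes the system.
install_jupyter() {
    if ! have_cmd pip3; then
        # Same package as in the text; -y answers the confirmation prompt.
        sudo apt install -y python3-pip
    fi
    pip3 install jupyter
}

# Uncomment the next line to actually run the installation:
# install_jupyter
```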

To download Spark, open the Apache Spark website and go to the download menu. Choose the options that give you the package used in this post, spark-2.1.0-bin-hadoop2.7 (Spark 2.1.0, pre-built for Hadoop 2.7). If a later version is available, you are free to choose it.

We want the package in the right location, so open the file explorer, cut the package, and paste it into your home folder.

Then go to your command line and type this:

sudo tar -zxvf spark   (press Tab here to auto-complete the file name)

This is going to unzip it for us.
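
As a rough sketch of the same step in script form, the helper below unpacks an archive into a target folder and prints the name of the folder it created, which is the value needed for SPARK_HOME later. The function name extract_archive is made up for this example, and the archive file name in the example call is assumed from the folder name used in this post. It runs tar without sudo, which should not be needed inside your own home folder.

```shell
#!/usr/bin/env bash
# extract_archive ARCHIVE DEST: unpack a gzipped tar archive into DEST
# and print the name of its top-level folder.
extract_archive() {
    archive="$1"
    dest="$2"
    tar -xzf "$archive" -C "$dest" || return 1
    # List the archive and keep the first path component of the first entry.
    tar -tzf "$archive" | head -n 1 | cut -d/ -f1
}

# Example call (assumed file name; use whatever file you downloaded):
# extract_archive "$HOME/spark-2.1.0-bin-hadoop2.7.tgz" "$HOME"
```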

Now we want to tell Python where to find Spark:

export SPARK_HOME='/home/ubuntu/spark-2.1.0-bin-hadoop2.7'

export PATH=$SPARK_HOME:$PATH
export PYTHONPATH=$SPARK_HOME/python:$PYTHONPATH

export PYSPARK_DRIVER_PYTHON="jupyter"
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"
export PYSPARK_PYTHON=python3
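
A typo in SPARK_HOME is easy to miss, so a quick sanity check like the sketch below may help. The function name check_spark_home is made up for this example; it only tests that the folder contains the bin/pyspark launcher and the python folder that the PYTHONPATH line points to.

```shell
#!/usr/bin/env bash
# check_spark_home DIR: succeeds when DIR looks like an unpacked Spark folder,
# i.e. it contains the bin/pyspark launcher and the python/ folder.
check_spark_home() {
    [ -n "$1" ] && [ -f "$1/bin/pyspark" ] && [ -d "$1/python" ]
}

if check_spark_home "${SPARK_HOME:-}"; then
    echo "SPARK_HOME looks fine: $SPARK_HOME"
else
    echo "SPARK_HOME is unset or does not point at a Spark folder: '${SPARK_HOME:-}'"
fi
```

Note that export only lasts for the current terminal session. To keep the settings after a restart, the export lines can be added to the end of ~/.bashrc.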