
This site is for helping you get started with Docker and containers.

 

Docker is a platform for running applications inside containers. If you are familiar with Python virtual environments, you can think of a container as a system-wide virtual environment; it provides what looks and feels like a clean Ubuntu installation where your application can live and run, without the hassle and overhead of setting up a full-fledged virtual machine.

Containers are ubiquitous in the software industry, so I would recommend taking the time to read the official Docker documentation (https://docs.docker.com/get-started/). However, if you're in a hurry and just want to get a GPU-powered python environment up and running asap, the guide below should help you get started.

Step-by-step guide

1 - Making sure you have access to docker

By default (for security reasons), you do not have access to docker, even if you have access to the host machine. To check whether you have access, log into the host machine and run any docker command, e.g.:

Show docker images/processes
# Get a list of all the docker images
docker images 
# or
docker image list 

# To view a list of docker processes
docker ps

If you get an error message complaining about access rights, it means that you are not part of the docker group on the machine. Ask your supervisor to relay a request to one of the engineers at IDI, who should be able to grant you the necessary privileges.
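Before asking for privileges, you can also check your group memberships directly. This is a small sketch; it assumes the group is named docker, which is the conventional name but may differ on a particular setup:

```shell
# List the groups your user belongs to
id -nG

# Print a hint depending on whether "docker" is among them
if id -nG | grep -qw docker; then
  echo "You are in the docker group"
else
  echo "You need to be added to the docker group"
fi
```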


2 - The Dockerfile - building an image

Our end goal is to make a container from which we can run our own code. However, to achieve this, we need to create something called an image first. An image is a prototype of a container; it serves as a premade snapshot that can be used to spawn any number of containers. An image is created from something called a Dockerfile, which in its most basic form is just a list of prerequisites you want installed and commands you want run while the image is being built. The example below should be a nice starting point.

The Dockerfile - Building an image
# Use the latest tf GPU image as parent. This operation is analogous to inheritance in OOP. 
# The image ships with tensorflow-gpu and jupyter installed for python 2. It is also  
# configured so that a jupyter server will be launched at container startup. Note that you 
# don't have to use this image as parent. 
FROM tensorflow/tensorflow:latest-gpu 

# Set working directory for container 
WORKDIR /app  

# Make ssh directory (useful for adding ssh keys later) 
RUN mkdir -p /root/.ssh 

# Update repositories 
RUN apt-get update 

# Install git  
RUN apt-get install git -y 

# Install pip3 (parent image only comes with python2 stuff) 
RUN apt-get install python3-pip -y 

# Install your python packages  
RUN pip3 install --upgrade pip 
RUN pip3 install numpy 

# Add more pip installs here. Alternatively move everything to a dedicated requirements file.
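If you prefer a dedicated requirements file, as mentioned in the comment above, the pip install lines can be replaced with something like the following. This is only a sketch; the file name requirements.txt and its location next to the Dockerfile are assumptions you can adjust:

```dockerfile
# Copy the requirements file into the image and install everything listed in it
COPY requirements.txt /app/requirements.txt
RUN pip3 install -r /app/requirements.txt
```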


The snippet above should be saved as a simple text file called Dockerfile. To build an image from it, I would recommend putting it in a dedicated folder, e.g.:

Move to dedicated folder
mkdir ~/docker/myproject 
mv Dockerfile ~/docker/myproject/Dockerfile

 

Then change working directory to the newly created folder and build the image:

Change working directory and build image
cd ~/docker/myproject 
docker build -t <image_name> . 

 

Note the dot ‘.’ at the end of the command; don't forget it, as it tells docker where to look for a Dockerfile. <image_name> is a user-specified name used to identify the created image. By convention, since all images created on the machine are stored in one place, it is common to include your username in the image name; e.g.: olanorm/testproject
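As a concrete example, using the illustrative name olanorm/testproject, the build could look like this (the optional version tag is just a suggestion, not required by anything in this guide):

```shell
cd ~/docker/myproject

# Build the image and name it after your username and project
docker build -t olanorm/testproject .

# Optionally, attach an explicit version tag instead of the default "latest"
docker build -t olanorm/testproject:v1 .
```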

If the build command executed smoothly, your image should now be ready. You can verify this by running:

Show docker images
docker image list

 

It prints a list of all the available images on the current machine. You should find your newly created one at the top.

 

3 - Running the container

Once the image is successfully built, we can run a container from it. The docker run command supports many different options that you might want to explore through the official reference. However, to keep things simple, here’s a command for running a container that serves a jupyter notebook accessible from the outside:

Run container
docker run -d --rm -p YYYY:8888 --name <container_name> <image_name>

 

  • -d means that the container will run in detached mode, i.e. it will run in the background while freeing up your current shell. 

  • --rm means that the container will be cleaned up (everything inside it will be deleted) after it has exited. A container exits when its root process (which, if you used the tensorflow:latest-gpu parent image, is a jupyter notebook) terminates. Skip this flag if you want to keep the container around after it has exited; just remember to clean it up yourself so that you don’t clutter up the system. 

  • -p maps a network port from the host machine to a port in the container. In the command above, we map YYYY, which should be an unused port number of the host machine, to port 8888 in the container, which is where the default jupyter process will listen. 

  • --name sets a name for the container. It is not needed, since the container also gets a hash ID, but it is good practice to mark it with a human readable string as well. For instance, if you called your image olanorm/myproject, you can call the container olanorm_myproject (since slashes are not allowed in container names). This name can be used to access the container when running extra commands in it or when you wish to shut it down. 

  • The final, positional argument is just the name of the image created in the previous section.
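Putting the flags together, a concrete invocation could look like the following. The port number 8900 and the image and container names are purely illustrative placeholders:

```shell
# Start a detached, self-cleaning container, mapping host port 8900
# to port 8888 inside the container (where jupyter listens)
docker run -d --rm -p 8900:8888 --name olanorm_myproject olanorm/testproject

# When you are done, stop the container; with --rm it is also removed automatically
docker stop olanorm_myproject
```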

 

After executing the run command, docker will print the ID of the new container. The container is now running and ready; you should be able to see it by executing:

Show docker process
docker ps

The command prints a table of the containers running on the machine. Your new container should show up in the list, with the name you chose under NAMES and the port mapping under PORTS.

4 - Connecting to jupyter

Normally, when using jupyter notebook-like apps on a computer, we just run the server and access it through a web browser. However, since the server process is running on a different machine that most likely does not expose the serving port on the network, we need some extra magic to make it accessible. To achieve this, open a new SSH connection to the server that is running your docker container with the following command:

Connect to server
ssh -L XXXX:localhost:YYYY <username>@<hostname>

Where XXXX corresponds to any unused (>1023) port on your local machine, and YYYY corresponds to the docker host machine port you mapped to 8888 when running the container.
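As an illustration, suppose you mapped host port 8900 to the container and want to use local port 8888 on your own machine; the username and hostname below are of course placeholders for your own:

```shell
# Forward local port 8888 to port 8900 on the docker host,
# keeping the connection open while you work
ssh -L 8888:localhost:8900 olanorm@gpu-server.example.com
```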

 

If you’re working on Windows, you might not be able to run the command above. However, the program you used to establish your initial connection to the server most likely has an option for setting up an SSH tunnel as well. A tutorial on how to do this with PuTTY can be found here.


Now, if you open your favorite web browser and go to localhost:XXXX, you should be met with the jupyter notebook sign-in page.

The last piece of the puzzle is to get past this login screen. One way would be to configure jupyter to run without any authentication. However, it is just as easy to simply authenticate once. To get the token required for logging in, execute the following docker command:

 

Get token for login
docker exec <container_name> jupyter notebook list

Copy the token from the printed URL into the sign-in page, and you should be all set.
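docker exec, used above to query jupyter, can run any command inside a running container. For instance, you can open an interactive shell in the container; the container name is a placeholder for whatever you passed to --name:

```shell
# Open an interactive bash shell inside the running container
# (-i keeps stdin open, -t allocates a terminal)
docker exec -it <container_name> bash
```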


Feel free to help us provide a better wiki by sharing your skills and knowledge.
