January 28, 2021

Create Keras Docker image with ML libraries

By Andrei Maksimov

docker, h5py, keras, matplotlib, pandas, python, pyyaml, seaborn, sklearn, tensorflow

This tutorial shows how to build an Ubuntu 20.04 LTS-based Keras Docker container for Machine Learning. We’ll install Python 3, Jupyter, Keras, TensorFlow, TensorBoard, Pandas, Sklearn, Matplotlib, Seaborn, pyyaml, and h5py into the environment. Feel free to extend the image with your own libraries. Let’s get started.


I’ve updated the container to Ubuntu 20.04 LTS base and improved the speed of the Docker image build process. Now we’re not building OpenCV from source but installing it from apt.

Environment setup is a common question when learning Machine Learning (ML). In this article, I’ll show you how to create your own Docker container that includes the frameworks listed above for a comfortable start.

These are among the most widely used Python frameworks for Data Science, and you’ll find most of them in any how-to article on the Internet. In the next article (How to build Python Data Science Docker container based on Anaconda), I’ll show how to build the same image on top of the Anaconda distribution.

Requirements

All you need is Docker and a text editor installed on your system.

Project Structure

Here’s the final project structure:

$ tree -a python_data_science_container
$ python_data_science_container
├── Dockerfile
├── conf
│   └── .jupyter
│       └── jupyter_notebook_config.py
└── run_jupyter.sh
2 directories, 3 files
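If you prefer to scaffold the whole tree in one shot, the structure above can be created with a few shell commands (the file contents are filled in over the following sections):

```shell
# Create the project skeleton shown above
mkdir -p python_data_science_container/conf/.jupyter
touch python_data_science_container/Dockerfile
touch python_data_science_container/conf/.jupyter/jupyter_notebook_config.py
touch python_data_science_container/run_jupyter.sh
# The startup script must be executable inside the image
chmod +x python_data_science_container/run_jupyter.sh
```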

Dockerfile

All you need to do is create a project folder and a file named Dockerfile inside it:

$ mkdir python_data_science_container
$ cd python_data_science_container
$ vim Dockerfile

After that, put the following content into the Dockerfile:

FROM ubuntu:20.04
LABEL maintainer="Andrei Maksimov"
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y \
        libopencv-dev \
        python3-pip \
        python3-opencv && \
    rm -rf /var/lib/apt/lists/*
RUN pip3 install tensorflow && \
    pip3 install numpy \
        pandas \
        scikit-learn \
        matplotlib \
        seaborn \
        jupyter \
        pyyaml \
        h5py && \
    pip3 install keras --no-deps && \
    pip3 install opencv-python && \
    pip3 install imutils
RUN ["mkdir", "notebooks"]
COPY conf/.jupyter /root/.jupyter
COPY run_jupyter.sh /
# Jupyter and TensorBoard ports
EXPOSE 8888 6006
# Store notebooks in this mounted directory
VOLUME /notebooks
CMD ["/run_jupyter.sh"]
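If you care about reproducible builds, one common variation is to keep the Python dependencies in a requirements.txt file with pinned versions instead of listing them in the Dockerfile. A minimal sketch (the requirements.txt file is an addition of your own, not part of the project structure above):

```
# Sketch: install pinned dependencies from a requirements file
COPY requirements.txt /tmp/requirements.txt
RUN pip3 install -r /tmp/requirements.txt
```

This way, rebuilding the image months later still installs the same library versions.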

You may always find up-to-date examples of my Dockerfile on GitHub, which I use to create my personal Data Science container environment (also available on Docker Hub for free).

Jupyter configuration

Now that we’ve declared the container and its components, it’s time to prepare a configuration for Jupyter. Create a file jupyter_notebook_config.py in conf/.jupyter/ with the following content:

# get the config object
c = get_config()
# in-line figure when using Matplotlib
c.IPKernelApp.pylab = 'inline'
c.NotebookApp.ip = '*'
c.NotebookApp.allow_remote_access = True
# do not open a browser window by default when using notebooks
c.NotebookApp.open_browser = False
# No token. Always use jupyter over ssh tunnel
c.NotebookApp.token = ''
c.NotebookApp.notebook_dir = '/notebooks'
# Allow to run Jupyter from root user inside Docker container
c.NotebookApp.allow_root = True

As you can guess from Dockerfile, we’ll put it in /root/.jupyter/ folder during the container build process.
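Note that newer Jupyter releases (Jupyter Server / Notebook 7) renamed the NotebookApp options to ServerApp. If you build the image with a recent Jupyter version, a roughly equivalent config would look like this (a sketch; check the option names against the Jupyter version you actually install):

```python
c = get_config()
c.ServerApp.ip = '0.0.0.0'
c.ServerApp.allow_remote_access = True
c.ServerApp.open_browser = False
c.ServerApp.token = ''
c.ServerApp.root_dir = '/notebooks'
c.ServerApp.allow_root = True
```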

Creating startup script

The last thing we need to do is create a script, run_jupyter.sh, which will launch the Jupyter server inside our container at startup. Create the file with the following content:

#!/usr/bin/env bash
jupyter notebook "$@"

And make this file executable:

$ chmod +x run_jupyter.sh

This file will be launched inside your container by default each time you start a new one.

Creating container image

The last stage is building the container image. Run the following command from the project directory:

$ docker build -f Dockerfile -t python_data_science_container .

During the build process, Docker will install all the necessary libraries and frameworks inside your container image and make them available for use.

Running container

Now you have a working image, and it’s time to start a container from it. Create a folder inside your project’s folder where we’ll store all our Jupyter Notebooks with the source code of our projects:

$ mkdir notebooks

And start the container with the following command:

$ docker run -it -p 8888:8888 \
    -p 6006:6006 \
    -d \
    -v $(pwd)/notebooks:/notebooks \
    python_data_science_container

It will start the container and expose Jupyter on port 8888 and TensorBoard on port 6006 of your local computer or server, depending on where you’re executing this command.

Please be aware that this container was created for local development purposes only. I removed authentication from Jupyter in this container, so anybody who can reach port 8888 or 6006 can, of course, execute arbitrary Python code.
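If you do need to expose the container beyond your local machine, one option is to put token authentication back in jupyter_notebook_config.py. A minimal sketch (the token value is a placeholder; pick your own secret):

```python
# Require a token on every connection (placeholder value, replace it)
c.NotebookApp.token = 'replace-with-your-own-secret-token'
```

Jupyter will then ask for this token before letting anyone open or run notebooks.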

Pre-built Keras Docker Container

If you don’t want to create and maintain your own container, and the components above are sufficient for you, please feel free to use my container, which I update regularly:

$ docker run -it -p 8888:8888 \
    -p 6006:6006 \
    -d \
    -v $(pwd)/notebooks:/notebooks \
    amaksimov/python_data_science

I hope this article was helpful for you. If you liked it, please share it on social media. See you soon!

Andrei Maksimov

I’m a passionate Cloud Infrastructure Architect with more than 15 years of experience in IT.

Any of my posts represent my personal experience and opinion about the topic.

