JupyterHub-Fastbook

Taking fast.ai's Practical Deep Learning for Coders course notebooks repository and putting it into a Docker container.

The image comes pre-loaded with Jupyter and all the required dependencies (installed in a conda environment), giving you an all-in-one, automated, repeatable deployment without any setup.

If you lead a team, you can scale out by deploying the environment to multiple users at once via JupyterHub, hosted on your own Kubernetes cluster.

This is a standalone deployment which can be extended or used as-is for your own multi-user Jupyter workflows.

*See the Further Reading section for more details on the technologies mentioned above.

Table of Contents

Quickstart
- Running The Docker Image Locally
- Deploying JupyterHub to Your Kubernetes Cluster
Overview
Advanced Usage
JupyterHub Kubernetes Deployment Explanation
- Setup
- Deployment
Further Reading

Quickstart

Running The Docker Image Locally

# Note: the `latest` tag is used here for expediency. When possible, you should
# pin your version by specifying an exact Docker image tag,
# e.g., `TAG=v20201007-7890c25`
TAG=latest
docker run -p 8888:8888 teozosa/jupyterhub-fastbook:${TAG}

Note: This will automatically pull the image from Docker Hub if it is not already present on your machine; it is fairly large (~5 GB), so this may take a while.

Follow the directions on-screen to log in to your local Jupyter notebook environment! 🎉

Note: The first URL may not work. If that happens, try the URL beginning with http://127.0.0.1.

Important: When running the fast.ai notebooks, be sure to switch the notebook kernel to the `fastbook` environment.
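
If port 8888 is already taken on your machine, or you want to pin a tag, the `docker run` line above can be parametrized. The following is a minimal sketch, not part of this project: `fastbook_run_cmd` is a hypothetical helper that only prints the command so you can inspect it before running it. The `--rm` flag (remove the container on exit) is an addition that the original command does not use.

```shell
# Hypothetical helper: print a `docker run` command for a given image tag
# and host port. The container side of the mapping stays at 8888.
fastbook_run_cmd() {
  tag="${1:-latest}"
  host_port="${2:-8888}"
  printf 'docker run --rm -p %s:8888 teozosa/jupyterhub-fastbook:%s\n' "$host_port" "$tag"
}

# Example: pinned tag, served on host port 9999
fastbook_run_cmd v20201007-7890c25 9999
```

Note that Jupyter's on-screen URL will still mention port 8888; when using a different host port, substitute it in the URL you open in your browser.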

Deploying JupyterHub to Your Kubernetes Cluster

Please see the unabridged Kubernetes deployment section for in-depth explanations of the steps below.

From the root of your repository, on the command line, run:

# Generate and store secret token for later usage
echo "export PROXY_SECRET=$(openssl rand -hex 32)" > .env

# Install Helm
curl https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 | bash
# Verify Helm
helm list
# Add JupyterHub Helm charts
helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
helm repo update

# Deploy JupyterHub onto your Kubernetes cluster
# using the secret token you generated in the first step.
# Note: the `latest` tag is used here for expediency. When possible, you should
# pin your version by specifying an exact Docker image tag,
# e.g., `TAG=v20201007-7890c25`
make deploy TAG=latest

You should then be greeted by Helm's post-deployment messages.

Check that all the pods are running

kubectl --namespace jhub get all
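
If the output is long, it can help to list only the pods that are not up yet. The sketch below assumes the default column layout of `kubectl get pods --no-headers` (NAME, READY, STATUS, RESTARTS, AGE); `pods_not_ready` is a hypothetical helper, not something this project ships.

```shell
# Hypothetical helper: read `kubectl get pods --no-headers` output on stdin
# and print the names of pods whose STATUS is neither Running nor Completed.
pods_not_ready() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}

# Usage against a live cluster (requires kubectl):
#   kubectl --namespace jhub get pods --no-headers | pods_not_ready
```

An empty result means every pod in the namespace reports `Running` or `Completed`.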

Get the JupyterHub server address

JUPYTERHUB_IP=$(kubectl --namespace jhub get service proxy-public -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo $JUPYTERHUB_IP
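
On some clusters the load balancer address is not assigned immediately, in which case the command above prints an empty line. A minimal sketch of a retry loop follows; `wait_for_value` is a hypothetical helper, and the attempt count and delay in the usage comment are arbitrary choices.

```shell
# Hypothetical helper: run a command repeatedly until it prints something.
# Arguments: <attempts> <delay-seconds> <command...>
wait_for_value() {
  attempts="$1"
  delay="$2"
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    value="$("$@" 2>/dev/null)"
    if [ -n "$value" ]; then
      echo "$value"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1   # gave up without getting a value
}

# Usage against a live cluster (requires kubectl):
#   JUPYTERHUB_IP="$(wait_for_value 30 10 kubectl --namespace jhub get service proxy-public \
#     -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
```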

Type the IP from the previous step into your browser, log in^*, and you should now be in the JupyterLab UI! 🎉

^{* JupyterHub is running with a default dummy authenticator, so entering any username and password combination will let you into the hub.}

Important: When running the fast.ai notebooks, be sure to switch the notebook kernel to the `fastbook` environment.

Overview

Benefits

- Immediately get started on the fast.ai Practical Deep Learning for Coders course without any extra setup via the JupyterHub-Fastbook Docker image^[1].
- Deploy JupyterHub (with the JupyterHub-Fastbook Docker image):
  - To your Kubernetes cluster^[0] via the official Helm chart.
  - [Optional] Using GitHub OAuth for user authentication.
- Roll your own JupyterHub deployment:
  - Use the deployment as-is; you get a fully-featured JupyterHub deployment that just so happens to have fast.ai's Practical Deep Learning for Coders course dependencies pre-loaded.
  - Extend the configuration and deployment system in this project for your particular needs.
  - Build and push your own JupyterHub-Fastbook images to your own Docker registry.

^{[0] Tested with MicroK8s on Ubuntu 18.04.4.}

^{[1] Based on the official jupyter/minimal-notebook from Jupyter Docker Stacks. This means you get the same features as a default JupyterHub deployment with the added functionality of an isolated fastbook conda environment.}

Example Uses

Use JupyterHub-Fastbook in conjunction with the fast.ai Practical Deep Learning for Coders course:

- To go through the course on your own with virtually no setup, by running the JupyterHub-Fastbook Docker image locally.
- As the basis for a study group.
- To onboard new junior members of your organization's AI/ML team.

Or anything else you can think of!

Why This Project?

This project aims to lower the initial technical barriers to the fast.ai Practical Deep Learning for Coders course by automating the setup, configuration, and maintenance of a compatible programming environment, for individuals and groups alike.

In the same spirit as the course, if you don't need a PhD to build AI applications, you also shouldn't need to be a DevOps expert to get started with the course.

We've done all the work for you. All you need to do is dive in and get started!

Technical Notes

- When running the Docker image as a container in single-user mode, outside of Kubernetes, you will interact directly with the Jupyter Notebook interface (see: Quickstart: Running The Docker Image Locally).
- The JupyterHub Kubernetes deployment portion of this project is based on the official Zero to JupyterHub with Kubernetes guide and assumes you already have your own Kubernetes cluster set up. If you don't and are just starting out, Minikube is great for local development and MicroK8s works well for single-node clusters.

Advanced Usage

Makefile Overview

Available rules

build               Build Docker container
config.yaml         Generate JupyterHub Helm chart configuration file
deploy              Deploy JupyterHub to your Kubernetes cluster
push                Push image to Docker Hub container registry

Tip: invoking make without any arguments will display auto-generated documentation similar to the above.

Build and Push Your Own Docker Image

In addition to deployment, the Makefile contains facilities to build and push Docker images to your own repository. Simply edit the appropriate fields in `Makefile` and invoke `make` with one of: `build`, `push`.

Enabling GitHub OAuth^[2]

Determine your JupyterHub host address (the address you use in your browser to access JupyterHub) and add it to your `.env` file

JUPYTERHUB_IP=$(kubectl --namespace jhub get service proxy-public -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo "export JUPYTERHUB_IP=${JUPYTERHUB_IP}" >> .env
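
When you register the OAuth app in the next step, GitHub asks for an authorization callback URL. JupyterHub's OAuth authenticators conventionally use the `/hub/oauth_callback` path on the hub's address. The sketch below builds that URL; `oauth_callback_url` is a hypothetical helper, and both the plain-HTTP scheme and the callback path are assumptions you should check against your generated `config.yaml`.

```shell
# Hypothetical helper: build the OAuth callback URL for a given host.
# Assumes plain HTTP unless a scheme is passed as the second argument.
oauth_callback_url() {
  host="$1"
  scheme="${2:-http}"
  echo "${scheme}://${host}/hub/oauth_callback"
}

# Example with a placeholder address from the documentation range
oauth_callback_url 203.0.113.10
# → http://203.0.113.10/hub/oauth_callback
```

With the real address, this would be `oauth_callback_url "$JUPYTERHUB_IP"`.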

Generate your GitHub OAuth credentials and add them to your `.env` file

Follow the tutorial in the GitHub documentation (Building OAuth Apps - Creating an OAuth App), then:

GITHUB_CLIENT_ID=$YOUR_GITHUB_CLIENT_ID
GITHUB_CLIENT_SECRET=$YOUR_GITHUB_CLIENT_SECRET
echo "export GITHUB_CLIENT_ID=${GITHUB_CLIENT_ID}" >> .env
echo "export GITHUB_CLIENT_SECRET=${GITHUB_CLIENT_SECRET}" >> .env
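
Before redeploying, it can be worth confirming that `.env` actually contains every value collected so far. This is a minimal sketch assuming the `export NAME=value` line format used above; `missing_env_vars` is a hypothetical helper, not part of the Makefile.

```shell
# Hypothetical helper: print the names of required variables that an env
# file does not define with a non-empty value.
# Arguments: <env-file> <name...>
missing_env_vars() {
  env_file="$1"
  shift
  for name in "$@"; do
    grep -q "^export ${name}=." "$env_file" 2>/dev/null || echo "$name"
  done
}

# Usage:
#   missing_env_vars .env PROXY_SECRET JUPYTERHUB_IP GITHUB_CLIENT_ID GITHUB_CLIENT_SECRET
```

No output means all of the listed variables are present.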

Redeploy your JupyterHub instance

# Note: the `latest` tag is used here for expediency. When possible, you should
# pin your version by specifying an exact Docker image tag,
# e.g., `TAG=v20201007-7890c25`
make deploy TAG=latest

Now, the first time a user logs in to your JupyterHub instance, they will be greeted by GitHub's OAuth authorization screen.

Once they click "Authorize", they will be authenticated automatically via GitHub OAuth whenever they log in.

^{[2] See: JupyterHub documentation: Authenticating with OAuth2 - GitHub}

JupyterHub Kubernetes Deployment Explanation

Setup

source: JupyterHub documentation: Setting up JupyterHub

Note: commands in this section should be run on the command line from the root of your repository.

Generate a secret token for your JupyterHub deployment and place it in your local `.env` file

echo "export PROXY_SECRET=$(openssl rand -hex 32)" > .env

DANGER! DO NOT VERSION CONTROL THIS FILE!

If you need to store these values in version control, consider using something like SOPS.
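
Note that the `>` redirect above overwrites `.env`, so re-running it later would discard anything appended in the meantime (for example, the GitHub OAuth values). A minimal sketch of a non-destructive variant follows; `ensure_proxy_secret` is a hypothetical helper, and the `/dev/urandom` fallback and `chmod 600` are additions that the original command does not perform.

```shell
# Hypothetical helper: add PROXY_SECRET to an env file only if it is missing,
# so existing values in the file are left untouched.
ensure_proxy_secret() {
  env_file="${1:-.env}"
  if [ -f "$env_file" ] && grep -q '^export PROXY_SECRET=' "$env_file"; then
    return 0   # keep the existing token
  fi
  if command -v openssl >/dev/null 2>&1; then
    secret="$(openssl rand -hex 32)"
  else
    # Fallback when openssl is unavailable: 32 random bytes as hex
    secret="$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')"
  fi
  echo "export PROXY_SECRET=${secret}" >> "$env_file"
  chmod 600 "$env_file"   # readable and writable by the owner only
}

# Usage, from the root of your repository:
#   ensure_proxy_secret .env
```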

Install Helm

source: JupyterHub documentation: Setting up Helm

Download and install

curl https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 | bash

Verify installation and add JupyterHub Helm charts:

# Verify Helm
helm list
# Add JupyterHub Helm charts
helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
helm repo update

Deployment

source: JupyterHub documentation: Setting up JupyterHub

Generate a JupyterHub configuration file^*

make config.yaml

This will create a config.yaml by populating fields of config.TEMPLATE.yaml with the pre-set deployment variables^† and values specified in your .env file.

^{* Anything generated here will be overwritten by the following deployment step with the most recent values, but this step is here for completion's sake.}
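
To make the idea of populating a template concrete, here is a minimal sketch of placeholder substitution. It is not the Makefile's actual logic: the `${NAME}` placeholder syntax, the `render_template` helper, and the YAML keys in the usage comment are all assumptions for illustration, so check `config.TEMPLATE.yaml` and the Makefile for the real mechanism.

```shell
# Hypothetical sketch: replace ${NAME} placeholders in a template with the
# values of the named shell variables, printing the result to stdout.
# Values must not contain '|', '&' or backslashes (sed would misread them).
render_template() {
  template="$1"
  shift
  script=""
  for name in "$@"; do
    eval "value=\${$name}"
    script="${script}s|\\\${${name}}|${value}|g;"
  done
  sed "$script" "$template"
}

# Usage, given a template line such as `secretToken: "${PROXY_SECRET}"`:
#   . ./.env
#   render_template my-template.yaml PROXY_SECRET > rendered.yaml
```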

Deploy JupyterHub to your Kubernetes cluster

Once you've verified config.yaml contains the correct information, on the command line, run:

# Note: the `latest` tag is used here for expediency. When possible, you should
# pin your version by specifying an exact Docker image tag,
# e.g., `TAG=v20201007-7890c25`
make deploy TAG=latest

This will deploy the JupyterHub instance to your cluster via the official Helm chart, parametrized by pre-set deployment variables^† and the config.yaml file you generated in the previous step.

^{† to override a pre-set deployment variable, simply edit the appropriate value in Makefile.}

A note on built-in image tag logic

The Makefile defaults to strong versioning of image tags (derived from Google's Kubeflow Central Dashboard Makefile) for unambiguous container image provenance.

Unless you are pushing to and pulling from your own registry, you MUST override the generated tag with your desired tag when deploying to your own cluster.
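
The example tag used throughout this document, `v20201007-7890c25`, looks like a date stamp followed by a short commit hash. Assuming that format, a tag of the same shape could be produced as sketched below. `image_tag` is a hypothetical helper and is not the Makefile's own rule, which may include further components.

```shell
# Hypothetical helper: build a tag of the form v<YYYYMMDD>-<short commit>.
# Both parts can be passed explicitly; otherwise they come from `date` and git.
image_tag() {
  build_date="${1:-$(date +%Y%m%d)}"
  commit="${2:-$(git rev-parse --short HEAD)}"
  echo "v${build_date}-${commit}"
}

# Example with explicit values
image_tag 20201007 7890c25
# → v20201007-7890c25
```

Called without arguments inside a git checkout, it derives both parts from the current date and `HEAD`.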

Further Reading

fast.ai: A non-profit research group focused on deep learning and artificial intelligence.

Jupyter Notebook: An open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text.

JupyterHub: A multi-user version of the notebook designed for companies, classrooms and research labs.

Anaconda (conda for short): A free and open-source distribution of the Python and R programming languages for scientific computing, that aims to simplify package management and deployment.

Docker: A set of platform-as-a-service products that use OS-level virtualization to deliver software in packages called containers.

Kubernetes: An open-source system for automating deployment, scaling, and management of containerized applications.
