Add a Contents Manager

Jupyter has a pluggable storage API (called a contents manager) for persisting notebooks and files. Without a contents manager configured, users can still create notebooks, but those notebooks won’t persist between sessions. Note that contents managers are intended for relatively small files (notebooks, scripts, etc.), not large files such as datasets.

Storing Notebooks on HDFS

Notebooks can be persisted to HDFS using the jupyter-hdfscm package.

To enable this, first install jupyter-hdfscm in the notebook environment, not the JupyterHub environment.

# Install in the notebook environment
$ conda install -c conda-forge jupyter-hdfscm
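Because the package must live in the notebook environment rather than the JupyterHub one, it can be worth confirming that it is importable from the right interpreter. The following is a minimal sketch, assuming the package exposes a top-level module named hdfscm (as the class path used in the configuration suggests); the helper name is_importable is hypothetical.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be found by this interpreter."""
    # find_spec returns None when no such top-level module is installed.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Run this with the notebook environment's Python, not JupyterHub's.
    # "hdfscm" is assumed to be the module installed by jupyter-hdfscm.
    print("hdfscm importable:", is_importable("hdfscm"))
```

If this reports False when run with the notebook environment's interpreter, the package was probably installed into a different environment.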

Then add the following to your jupyterhub_config.py file. This forwards the contents manager configuration to the notebook process started by YarnSpawner.

# Enable jupyter-hdfscm
c.YarnSpawner.args = ['--NotebookApp.contents_manager_class="hdfscm.HDFSContentsManager"']
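Assigning a new list to c.YarnSpawner.args replaces any arguments set earlier in the file. If other notebook arguments are already configured, something like the following sketch could merge the contents manager flag in without duplicating it. The helper name with_contents_manager and the '--debug' argument are illustrative assumptions, not part of YarnSpawner or jupyter-hdfscm.

```python
def with_contents_manager(existing_args, manager_class="hdfscm.HDFSContentsManager"):
    """Return a new args list containing the contents manager flag exactly once."""
    flag = "--NotebookApp.contents_manager_class"
    # Drop any earlier setting of the same flag so the new value wins.
    kept = [arg for arg in existing_args if not arg.startswith(flag + "=")]
    return kept + ['{}="{}"'.format(flag, manager_class)]


if __name__ == "__main__":
    # In jupyterhub_config.py this would look like:
    #   c.YarnSpawner.args = with_contents_manager(['--debug'])
    print(with_contents_manager(["--debug"]))
```

With an empty input list, the helper produces the same single-element list as the configuration line above.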

For more information see the jupyter-hdfscm documentation.

Other Options

As with authentication, there are several contents managers to choose from. A few other options: