.. _reprozip-jupyter: Making Jupyter Notebooks Reproducible with ReproZip *************************************************** **reprozip-jupyter** is a plugin for `Jupyter Notebooks `__, a popular open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. These are valuable documents for data cleaning, analysis, writing executable papers/articles, and more. However, Jupyter Notebooks are subject to dependency hell like any other application -- just the Notebook is not enough for full reproducibility. We have written a ReproZip plugin for Jupyter Notebooks to help users automatically capture dependencies (including data, environment variables, etc.) of Notebooks and also automatically set up those dependencies in another computing environment. Installation ============ You can install *reprozip-jupyter* with pip:: $ pip install reprozip-jupyter Or Anaconda:: $ conda install --channel conda-forge reprozip-jupyter Once successfully installed, you should then enable the plugin for both the client and server side of Jupyter Notebooks:: $ jupyter nbextension install --py reprozip_jupyter --user $ jupyter nbextension enable --py reprozip_jupyter --user $ jupyter serverextension enable --py reprozip_jupyter --user Once these steps are completed, when you start a Jupyter Notebook server, you should be able to see the ReproZip button in your notebook's toolbar. Packing ======= Once you have a notebook that executes the way you want, you can trace and pack all the dependencies, data, and provenance with *reprozip-jupyter* by simply clicking the button on the notebook's toolbar: .. image:: figures/rzj-button.png The notebook will execute from top-to-bottom and *reprozip-jupyter* traces that execution. If there are no errors in the execution, you'll see two pop-ups like this one after the other: .. image:: figures/rzj-running.png *reprozip-jupyter* will name the resulting ReproZip bundle (*.rpz*) as ``notebookname_datetime.rpz`` and save it to the same working directory the notebook is in: .. image:: figures/rzj-pkg.png Note that the notebook file itself (``.ipynb``) is not included in the bundle, so you should share or archive both of those files. The reason is that a lot of services can render notebooks (GitHub, OSF...), and they wouldn't be able to if it was in the RPZ file. Unpacking ========= Now, anyone can rerun the Jupyter notebook, with all dependencies automatically configured. First, they would need to install *reprounzip* and the *reprounzip-docker* plugin (see the :ref:`installation steps `). Second, they need to download or otherwise acquire the ``.rpz`` file and original ``.ipynb`` notebook they'd like to reproduce. To reproduce the notebook using the GUI, follow these steps: 1. Double-click the .rpz file. 2. The first tab in the window that appears is for you to set up how you'd like ReproUnzip to unpack and configure the contents of the *.rpz*. Choose docker as your unpacker, and choose the directory you'd like to unpack into. 3. Make sure the Jupyter Integration is checked, and click Run experiment: .. image:: figures/rzj-setup.png 4. This second table allows you to interact with and rerun the notebook. All you need to do is click 'Run Experiment' and the Jupyter Notebook home file list should pop up in your default browser (if not, navigate to ``localhost:8888``). Open the notebook, and rerun with every dependency configured for you! .. image:: figures/rzj-run.png On the command line, you would: 1. Set up the experiment using *reprounzip-docker*:: $ reprounzip docker setup 2. Rerun the notebook using *reprozip-jupyter*:: $ reprozip-jupyter run 3. The Jupyter Notebook home file list should pop up in your default browser (if not, navigate to ``localhost:8888``). 4. Open the notebook, and rerun with every dependency configured for you!