# Jupyter and Jupytext
Jupyter notebooks are a great way to perform post-processing. They let you interactively explore your data and quickly iterate by writing small code blocks.

## Launch locally

`torx` comes with a copy of Jupyter installed. If you're running `torx` on your local machine (e.g. a laptop), you can launch a Jupyter notebook server by running

```
source <path to torx>/env/bin/activate
jupyter lab
```

This should open a web-browser tab (generally at `localhost:8888/`) where you can create and open notebooks.

## Use port forwarding

If you're running on a remote machine (e.g. the TOK clusters or Marconi), you can use [port forwarding](https://ljvmiranda921.github.io/notebook/2018/01/31/running-a-jupyter-notebook/). In a terminal on the remote machine, run (where `XXXX` is some 4-digit number)

```
source <path to torx>/env/bin/activate
jupyter lab --no-browser --port=XXXX
```

and then, in another terminal on your local machine, run (where `YYYY` is another 4-digit number)

```
localuser@localhost: ssh -N -f -L localhost:YYYY:localhost:XXXX remoteuser@remotehost
```

then open `localhost:YYYY/` in a web-browser on your local machine. This doesn't work on all machines: some don't allow port forwarding.
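
Port forwarding won't behave as expected if the port you picked for `XXXX` or `YYYY` is already in use. The snippet below is a minimal sketch for checking this on whichever machine you run it on, using only the Python standard library. `is_port_free` is a hypothetical helper written for this page, not part of `torx`.

```python
import socket


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            # Binding fails with OSError if another process holds the port.
            sock.bind((host, port))
        except OSError:
            return False
    return True
```

For example, `is_port_free(8888)` on your laptop tells you whether `8888` is a usable value for `YYYY`. Run it on the remote machine to check a candidate for `XXXX`.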

**Notebook servers on Marconi**

If you experience issues getting a notebook to run on Marconi with the method above, the following sequence appears to be the most consistent (if you experience issues with this, please edit this entry!):

1. In a local terminal, set up a tunnel to Marconi first

   ```
   localuser@localhost: ssh -L localhost:YYYY:localhost:XXXX remoteuser@remotehost
   ```

   As above, `XXXX` will be the port on Marconi that gets linked to port `YYYY` on your local machine.
2. This will provide port forwarding and simultaneously open a terminal on Marconi. **In this same terminal**, launch the server

   ```
   source <path to torx>/env/bin/activate
   jupyter lab --no-browser --port=XXXX
   ```
3. Open `localhost:YYYY/` in a web-browser on your local machine.

## Use RVS

If you want to do your analysis on an MPCDF machine, you can use the [remote visualisation service](https://rvs.mpcdf.mpg.de/rv/). You'll first need to click 'Initialise Remote Visualization' for the machine that you want to run on (this only needs to be done once), then SSH into the machine and install `torx` (see the [installation guide](installation)) to get a `torx` kernel on the machine. Next, launch an RVS session using the web-interface, and once you've opened a notebook, select `torx` in the list of kernels.

Note that, unlike the other two options, this method uses the MPCDF version of Jupyter. This doesn't have `jupytext` installed, so you'll need to manually convert from `.py` to `.ipynb` files (see below). Alternatively, you can run the following in a terminal on the server

```
module purge
module load anaconda/3/2021.11
pip3 install jupytext --user --upgrade
```

This should add `jupytext` to your RVS environment.

# Jupytext: representing Jupyter notebooks as plain-text

Jupyter notebooks have one significant drawback. They're complicated JSON files with lots of information about when cells were run and what the output looks like. This is useful if you want to open a notebook in the same state, but it makes them a pain to review in GitLab.
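
To make this concrete, here's a minimal sketch of what a one-cell notebook looks like on disk. The dictionary follows the general layout of the notebook JSON format (version 4), but the cell content and execution count are made up for illustration.

```python
import json

# A hand-written, illustrative one-cell notebook (nbformat 4 layout).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 3,  # changes whenever the cell is re-run
            "metadata": {},
            "source": ["print(1 + 1)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["2\n"]}
            ],
        }
    ],
}

on_disk = json.dumps(notebook, indent=1)
source_only = "".join(notebook["cells"][0]["source"])

# The code itself is a single short line; everything else is bookkeeping
# that shows up in a diff every time the notebook is re-executed.
print(len(source_only.splitlines()), "line of code")
print(len(on_disk.splitlines()), "lines of JSON")
```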

Because of this, before sending notebooks to GitLab, we **convert them into normal Python files** using a Python package called [`jupytext`](https://jupytext.readthedocs.io/en/latest/index.html). You can read the docs and figure out how to use it yourself if you want, but for convenience I wrote a little helper function called `nbsync` (defined in [`infra/nbsync.py`](https://gitlab.mpcdf.mpg.de/phoenix/torx/-/blob/master/infra/nbsync.py)). This applies `jupytext` to the notebooks defined in [`torx_notebooks/notebooks_m.py`](https://gitlab.mpcdf.mpg.de/phoenix/torx/-/blob/master/torx_notebooks/notebooks_m.py). Note that to use this tool, you must first activate the `torx` virtual environment by running

```
source <path to torx>/env/bin/activate
```

To convert a Python file, say `torx_notebooks/grillix/getting_started.py` (which has the key `grillix_getting_started`), into the corresponding notebook `notebooks/grillix/getting_started.ipynb`, type

```
nbsync to_nb --key=grillix_getting_started
```

You can then open the `.ipynb` file using Jupyter to edit and execute the code. To write your changes back into the `.py` file you can use

```
nbsync to_py --key=grillix_getting_started
```

You can then **commit the changes to the `.py` file (the `.ipynb` files are ignored)** and push them back to GitLab.
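
For orientation, a `.py` representation might look roughly like the sketch below. This assumes the `percent` format of `jupytext`, where cells are delimited by `# %%` markers. The format that `nbsync` actually writes may differ, and the cell contents here are invented.

```python
# %% [markdown]
# # Getting started
# This markdown cell is stored as ordinary Python comments.

# %%
# A code cell: plain Python, with no outputs or execution counts stored.
values = [1, 2, 3]
total = sum(values)

# %%
print(total)
```

Because the file is ordinary Python, a review diff only shows the lines of code or text that actually changed.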

If you're checking out the repository **for the first time**, you can also execute

```
nbsync to_nb --key=all
```

to convert all of the `.py` files listed in [`torx_notebooks/notebooks_m.py`](https://gitlab.mpcdf.mpg.de/phoenix/torx/-/blob/master/torx_notebooks/notebooks_m.py) into their corresponding `.ipynb` representations (**don't do this on a repository where you've made uncommitted changes to the notebooks**). There's also `nbsync to_py --key=all` to update all `.py` representations, which can help to make sure that you commit all of your changes.

By default, if you don't tell `nbsync` what to do with existing files, it will skip any conversion where the destination file already exists. You can instead choose to `--overwrite` or `--backup` the existing destination file: `--overwrite` removes the existing file, while `--backup` renames it by adding a date-time string to the filename.
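
The three behaviours can be sketched as follows. This is an illustration of the policy described above, not the code in `infra/nbsync.py`: `prepare_destination`, its `mode` argument and the backup filename pattern are all assumptions made for the example.

```python
from datetime import datetime
from pathlib import Path


def prepare_destination(dest: Path, mode: str = "skip") -> bool:
    """Return True if a conversion may now write to `dest`.

    Hypothetical helper; `mode` is "skip", "overwrite" or "backup".
    """
    if not dest.exists():
        return True  # nothing in the way, always safe to write
    if mode == "skip":
        return False  # leave the existing file untouched
    if mode == "overwrite":
        dest.unlink()  # remove the existing file
        return True
    if mode == "backup":
        # Assumed naming scheme, e.g. nb.20240101-120000.ipynb
        stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
        dest.rename(dest.with_name(f"{dest.stem}.{stamp}{dest.suffix}"))
        return True
    raise ValueError(f"unknown mode: {mode!r}")
```

Skipping by default is the cautious choice: a conversion never destroys an existing file unless you explicitly ask for it.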

Pro-tip: if you do delete some of your work on a `.ipynb`, there's usually a hidden folder `.ipynb_checkpoints` in the same folder as your notebook which **might** have a backup.
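
Jupyter's checkpoint folder is named `.ipynb_checkpoints` (with an underscore), and checkpoints are typically saved as `<name>-checkpoint.ipynb`. A small sketch for looking one up is shown below. `find_checkpoints` is a hypothetical helper, not part of `torx` or Jupyter.

```python
from pathlib import Path


def find_checkpoints(notebook: Path) -> list[Path]:
    """List checkpoint copies of `notebook` saved by Jupyter, if any."""
    checkpoint_dir = notebook.parent / ".ipynb_checkpoints"
    if not checkpoint_dir.is_dir():
        return []  # no checkpoints were ever written next to this notebook
    # Match "<name>-checkpoint.ipynb" for this notebook only.
    return sorted(checkpoint_dir.glob(f"{notebook.stem}-checkpoint*.ipynb"))
```

An empty list means there is nothing to recover from. Otherwise, copy the checkpoint somewhere safe before opening it, since Jupyter may overwrite it on the next save.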