Recovering an unopenable iPython notebook

WARNING: This will be totally uninteresting unless you use iPython, and even then no promises!

Lately I’ve taken to developing code in iPython notebooks whenever possible. One thing I love about iPython is that you can turn your code into a sort of interactive guided tour, a structured and documented sequence of code snippets that a viewer can execute anew as they follow along.  This makes them great for meetings and (dare I say it) live demos, because you can demonstrate the code without typing a single keystroke, removing the biggest potential for human error.

The people who love iPython REALLY love iPython.  I even read a blog about someone who was thinking about writing their thesis in it, presumably using a whole ton of the %%latex magic which allows you to render and include snippets of LaTeX.  I’m not about to go that far, but I have been using it for a lot of results analysis and I’m interested in using it to walk other people through my work, for example in meetings.

That said, one morning a few weeks ago I encountered my first issue with iPython: when I tried to load a particular notebook in Firefox, my browser would stall and I would get this timeout message:

[Image: ipython error]

If I chose ‘Continue’, it wouldn’t resolve and would pop up again a few seconds later.  If I chose ‘Stop’, I could view only the first half of the notebook, which was really not cool because that missing second half had days of work.

So how did this happen?  Well, the couple times I’ve had this problem it was because I had, in the process of debugging something, added a poorly placed print command that may as well have been:
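
As a purely hypothetical stand-in (not the exact line from that notebook), picture a debug print buried in a loop that runs far more often than expected. The function name and the iteration cap below are made up for illustration, and the cap is only there so the sketch is safe to run:

```python
def noisy_debug(n_iterations=100000):
    """Hypothetical example: one debug line per pass through a hot loop."""
    for i in range(n_iterations):
        # Each call looks harmless, but the notebook keeps every line it prints.
        print("debug: i = %d" % i)
```

Called with the default, this leaves 100,000 lines in the cell’s saved output.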

When this happens in a plain console, you’re generally no worse for wear other than the fact that the console will only keep the last x lines of output.  When it happens in iPython, however, all the output is saved to the output cell (as far as I know) and this becomes an overwhelming mass of text that will cause the browser to hang when loading the notebook.  In the long term I need to change my debugging habits because this isn’t going to fly, but in the short term what I need to do is make my notebook usable again.

If you have to load the notebook in order to clear the output and you have to clear the output in order to load the notebook, it’s a chicken-and-egg problem, isn’t it?  Not quite, there’s a way out!  I’ve had this happen a couple of times, so I now have a few strategies for fixing it.  Nothing mind blowing, but if you found this on Google because you’re having the same problem, hopefully this will give you some ideas.

Let’s fix it!

Step 0: Before trying anything, make sure you back up the notebook!  Look for its .ipynb file and squirrel away a copy of it so you won’t need to worry about accidentally corrupting it further.
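
If you’d rather make the copy from Python than from the file manager, a minimal sketch using only the standard library could look like this. The function name `backup_notebook` and the `.bak` suffix are arbitrary choices for illustration:

```python
import shutil
from pathlib import Path

def backup_notebook(path):
    """Copy the notebook next to itself with a .bak suffix; return the copy's path."""
    src = Path(path)
    dst = src.with_name(src.name + ".bak")
    shutil.copy2(src, dst)  # copy2 also keeps the file's timestamps
    return dst

# Example (hypothetical file name):
# backup_notebook("analysis.ipynb")  # creates analysis.ipynb.bak
```

Note that this overwrites an existing `.bak` file of the same name, so pick a different suffix if you want to keep several generations.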

Step 1:  Don’t panic!  It’s probably not as bad as it looks. Try hitting ‘Continue’ when you get that browser timeout error; if it appears again, try it two or three or five more times until you’re sure no amount of continuing will allow the browser to recover on its own.  If it doesn’t recover, move on to step 2.  If it does recover, don’t just resume working!  You still need to figure out the cause of the problem or else you’ll have to go through that crash-continue process every time you open the notebook.  If your browser manages to load the notebook after some amount of effort, the first thing to do is clear all output, save and close the notebook, and try opening it again.  If that works, correct the code you think caused the problem and run all the code again, then save, close, and reopen the notebook.  If you still have problems and can’t nail down their source, you may need step 4.

Step 2: Turns out, .ipynb files are JSON files, which means that you can open them in your favourite text editor (I like Notepad++ for general purpose stuff).  Do you suspect you know which cell caused the problem?  Open the notebook in your editor and look for the culprit. If your problem was caused by too much output, do a quick scroll and you should see a single block making up a huge amount of the document.  When you find it, empty its output (for instance, cut that cell’s "outputs" entry down to an empty list, []), save the file, and try loading the notebook in iPython again to see if it’s resolved.  If not, revert to your backup and try the next step.
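
If scrolling through a huge file in an editor is painful, the same hunt can be done with a few lines of Python. This is only a sketch under some assumptions: that code cells keep their results under an "outputs" key, and that cells live either in a top-level "cells" list or, in older notebook files, inside "worksheets". The function names are made up for illustration:

```python
import json

def iter_cells(nb):
    """Yield every cell from either the older (worksheets) or newer (cells) layout."""
    if "worksheets" in nb:
        for worksheet in nb["worksheets"]:
            for cell in worksheet.get("cells", []):
                yield cell
    else:
        for cell in nb.get("cells", []):
            yield cell

def output_sizes(nb):
    """Return (cell_index, output_size_in_chars) for code cells, largest first."""
    sizes = []
    for i, cell in enumerate(iter_cells(nb)):
        if cell.get("cell_type") == "code":
            sizes.append((i, len(json.dumps(cell.get("outputs", [])))))
    return sorted(sizes, key=lambda pair: pair[1], reverse=True)

def clear_cell_output(nb, index):
    """Empty the outputs of the code cell at `index`; return True if one was cleared."""
    for i, cell in enumerate(iter_cells(nb)):
        if i == index and cell.get("cell_type") == "code":
            cell["outputs"] = []
            return True
    return False

# Example (hypothetical file names):
# with open("broken.ipynb", encoding="utf-8") as f:
#     nb = json.load(f)
# print(output_sizes(nb)[:3])          # the three heaviest cells
# clear_cell_output(nb, output_sizes(nb)[0][0])
# with open("fixed.ipynb", "w", encoding="utf-8") as f:
#     json.dump(nb, f, indent=1)
```

Writing to a new file rather than over the original means the backup from step 0 is never the only good copy.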

Step 3: Now we start pulling out the scripts.  This one here should be equivalent to clearing all output, though in my case it made the situation worse; rather than freezing up when it tried to load, it gave an error right off the bat which said that it couldn’t even try to open the notebook at all (I wish I could remember the exact error, I can’t seem to replicate it now!).  I’m not quite sure what went wrong, honestly.  Oh well, that’s why we made the backup!  The script was still worth a shot though, since it only takes a minute.
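
For reference, here is a rough sketch of what a clear-all-output script can look like. It is not the script mentioned above, just an illustration built on the standard json module. It assumes code cells keep their results under an "outputs" key, and that cells live either in a top-level "cells" list or, in older files, inside "worksheets". The function names are made up:

```python
import json

def iter_cells(nb):
    """Yield every cell from either the older (worksheets) or newer (cells) layout."""
    if "worksheets" in nb:
        for worksheet in nb["worksheets"]:
            for cell in worksheet.get("cells", []):
                yield cell
    else:
        for cell in nb.get("cells", []):
            yield cell

def clear_all_outputs(nb):
    """Empty the outputs of every code cell in place; return how many were cleared."""
    cleared = 0
    for cell in iter_cells(nb):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            # Newer files track an execution counter; reset it only if it is there.
            if "execution_count" in cell:
                cell["execution_count"] = None
            cleared += 1
    return cleared

def clear_notebook_file(src_path, dst_path):
    """Read src_path, strip all outputs, and write the result to dst_path."""
    with open(src_path, encoding="utf-8") as f:
        nb = json.load(f)
    cleared = clear_all_outputs(nb)
    with open(dst_path, "w", encoding="utf-8") as f:
        json.dump(nb, f, indent=1)
    return cleared
```

Everything other than the outputs (and the execution counter, where present) is written back untouched, which keeps the change as small as possible if the result still refuses to open.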

Step 4: This last approach takes a little work, but if your notebook is still unopenable by iPython and you’re out of other ideas, you can try to recreate it from scratch.   I don’t mean typing all your code out again, we can do better than that.  I forked that script from step 3 and turned it into a function that prints out and labels all the code and text snippets that you would need in order to ‘retype’ the notebook from scratch.  The function prints out something like what you see in the screenshot.  [Image: script for manual entry 1]
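
A function of that kind might look roughly like the sketch below. It is not the exact function described above, only an illustration of the idea: walk the cells, put a labelled divider comment before each one, keep code as-is, and comment out text cells so the whole dump can be pasted into a single code cell. It assumes cell text sits under "source" (or "input" for code cells in older files), and all names here are made up:

```python
import json

def iter_cells(nb):
    """Yield every cell from either the older (worksheets) or newer (cells) layout."""
    if "worksheets" in nb:
        for worksheet in nb["worksheets"]:
            for cell in worksheet.get("cells", []):
                yield cell
    else:
        for cell in nb.get("cells", []):
            yield cell

def cell_text(cell):
    """Return a cell's text as one string, whether stored as a string or a list of lines."""
    src = cell.get("source", cell.get("input", ""))
    return "".join(src) if isinstance(src, list) else src

def dump_for_retyping(nb):
    """Build one block of text with a labelled divider comment before each cell."""
    lines = []
    for n, cell in enumerate(iter_cells(nb), start=1):
        kind = cell.get("cell_type", "unknown")
        lines.append("# ---- cell %d (%s) ----" % (n, kind))
        if kind == "code":
            lines.append(cell_text(cell))
        else:
            # Comment out headers and text so the pasted block stays valid Python.
            lines.extend("# " + line for line in cell_text(cell).splitlines())
    return "\n".join(lines)

# Example (hypothetical file name):
# with open("broken.ipynb", encoding="utf-8") as f:
#     print(dump_for_retyping(json.load(f)))
```

Outputs are never touched, so the dump stays small no matter how much the original cells printed.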

Paste all the output into a single cell of a new iPython notebook, and go through it using the Split Cell command (Ctrl+Shift+-) to quickly re-divide it into cells.

Then go through the code again, this time executing a few snippets of code at a time, doing the save-close-reopen routine every so often if you really want to make sure there’s no problem anymore.  Once you know it all works, go through it one final time to clean up, deleting the divider comments and the # in front of each header title, and reassigning the headers to actually be headers.  Good as new!
