[IPython-User] SAGE notebook vs iPython notebook

Fernando Perez fperez.net@gmail....
Sat Jan 7 22:26:30 CST 2012


Hi,

On Thu, Jan 5, 2012 at 8:06 PM, Oleg Mikulchenklo <olmikul@gmail.com> wrote:
> What is relation and comparison between iPython notebook  and SAGE notebook?
> Can someone provide motivation and roadmap for iPython notebook as
> alternative to SAGE notebook?

Let me try to provide some perspective on this, since it's a valid
question that is probably in the minds of others as well.  This is
just *my* take on it, and other devs are welcome to pitch in as well
with their view.  Apologies in advance, this is quite long, but I'm
trying to do justice to many years of development, multiple
interactions between the two projects and the contributions of many
people.  I'm also sorry if I've forgotten anyone (but please
do correct me as I want to have a full record that's reasonably
trustworthy).

Let's go back to the beginning: when I started IPython in late 2001, I
was a graduate student in physics and had used extensively first
Maple, then Mathematica, both of which have notebook environments.  I
also used Pascal (earlier) then C/C++, but those two (plus IDL for
numerics) were the interactive environments that I knew well, and my
experience with them shaped my view.  In particular, I was a heavy
user of the Mathematica notebooks and liked them a lot.

I started using Python in 2001 and liked it, but the interactive
prompt felt like a crippled toy compared to the systems above or to a
Unix shell.  When I found out about sys.displayhook, I realized that
by putting in a callable object, I would be able to hold state and
capture previous results for reuse.  I then wrote a python startup
file to provide these features, giving me a 'mini-mathematica' in
python by also loading Numeric and Gnuplot.  Thus was my
'ipython-0.0.1' born, 259 lines to be loaded as $PYTHONSTARTUP.  In
case you are curious, I'm attaching it here, it's kind of funny that I
can 'release' IPython 0.0.1 as an email attachment...
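
As a rough illustration of the sys.displayhook idea described above, here
is a minimal sketch in modern Python 3.  It is not the original 259-line
startup file (that is the attachment); the class name CachingDisplayHook
and the 'Out[n]' print format are assumptions made for this example.

```python
import builtins
import sys


class CachingDisplayHook:
    """Callable that prints each result and remembers it for later reuse.

    Hypothetical name and layout, for illustration only.
    """

    def __init__(self):
        # Every non-None result, in the order it was displayed.
        self.results = []

    def __call__(self, value):
        # The interpreter calls the hook for every expression statement
        # typed at the prompt; None results are conventionally not shown.
        if value is None:
            return
        self.results.append(value)
        # Keep the usual '_' shortcut for the most recent result.
        builtins._ = value
        print("Out[%d]: %r" % (len(self.results), value))


hook = CachingDisplayHook()
# Installing the hook only affects interactive sessions, since scripts
# do not echo expression values.
sys.displayhook = hook
```

Placed in the file named by $PYTHONSTARTUP, something like this makes
earlier results available as hook.results[0], hook.results[1], and so on,
because the callable object holds state between prompts.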

I also read an article
(http://onjava.com/pub/a/python/2001/10/11/pythonnews.html) that
mentioned two good interactive systems for Python, LazyPython and IPP.
I contacted their authors, Nathan Gray and Janko Hauser, seeking to
join forces to create IPython together.  They were both very gracious
and let me use their code, but didn't have the time to participate in
the effort.  As any self-respecting graduate student with a
dissertation deadline looming would do, I threw myself full-time into
building the first 'real' IPython by merging my code with both of
theirs.  Eventually I did graduate, by the way.

The point of this little trip down memory lane is to indicate that
from day 1, Mathematica and its notebooks (and the Maple worksheets
before) were in my mind as my 'ideal' environment for daily
computational scientific work. In 2005 we had two Google SoC students
and we took a stab at building, using WX, a notebook system.  Robert
Kern then put some more work into the problem, but unfortunately that
prototype never really became fully usable.

In early 2006, William Stein organized what was probably the first
Sage Days at UCSD and invited me; William and I had been in touch
since 2005 as he was using IPython for the sage terminal interface.  I
suggested Robert come as well, and he demoed the notebook prototype he
had at that point.  It was very clear that the system wasn't
production ready, and William was already starting to think about a
notebook-like system for sage as well. Eventually he started working
on a browser-based system, and by Sage Days 2 in October 2006, as
shown by the coding sprint topics
(http://wiki.sagemath.org/sd2-sprint), the sage notebook was already
usable.

Sage going at it separately was completely reasonable and justified:
we were moving slowly and by that point even we weren't convinced the
wx approach would go anywhere. William is a force of nature and was
trying to get sage very usable very fast, so building something
integrated for his needs was certainly the right choice.

We continued working on ipython, and actually had another attempt at a
notebook-type system in 2007. By that point Brian and Min had come on
board and we had built the Twisted-based parallel tools. Using this,
Min got a notebook prototype working using an SQL/SQLAlchemy backend.
Like Sage this used a browser for the client but retained the 'IPython
experience', something the Sage notebook didn't provide.

This is a key difference between our approach and the Sage nb, so it's worth
clarifying what I mean: the Sage notebook took the route of using the
filesystem for notebook operations, so you can't meaningfully use 'ls'
in it or move around the filesystem yourself with 'cd', because sage
will always execute your code in hidden directories with each cell
actually being a separate subdirectory.  This is a perfectly valid
approach and lets the notebook do many useful things, but it is also
very different from the ipython model where we always keep the user
very close to the filesystem and OS.  For us, it's really important
that you can access local scripts, use %run, see arbitrary files
conveniently, as in data analysis and numerical simulation we make
extensive use of the filesystem.  So the sage model wasn't really a
good fit for us.

Furthermore, we wanted a notebook that would provide the entire
'IPython experience', meaning that magics, aliases, syntax extensions
and all other special IPython features worked the same in the notebook
and terminal.  The sage nb reimplemented some of these things in its
own way: they reused the % syntax but it has a different meaning, they
took some of the ipython introspection code and built their own x?/??
system, etc. In some cases it's almost like ipython, in others the
behavior is fairly different, which is fine for Sage but doesn't work
for us.

So we continued with our own efforts, even though by then the Sage
notebook was fairly mature.  For a number of reasons (I
honestly don't recall all the details), Min's browser-based notebook
prototype also never reached production quality.

Eventually, in 2009 we were able to fund Brian to dig into the heart
of the beast, and attack the fundamental problem that made ipython
development so slow and hard: the fact that the main codebase was an
outgrowth of that original merge from 2001 of my hack, IPP and
LazyPython, which had by then become incomprehensible and terribly
interconnected code with barely any test suite.  Brian was able to
devote a summer full-time to dismantling these pieces and reassembling
them so that they would continue to work as before (with only minimal
regressions), but now in a vastly more approachable and cleanly
modularized codebase.

This is where early 2010 found us, and then zerendipity struck: while
on a month-long teaching trip to Colombia I read an article about
ZeroMQ (http://lwn.net/Articles/370307) and talked to Brian about it,
as it seemed to provide the right abstractions for us with a simpler
model than Twisted.  Brian then blew me away, by writing in just two
days a new set of clean Cython-based bindings: we now had pyzmq!  It
became clear that we had the right tools to build a two-process
implementation of IPython that could give us the 'real ipython' but
communicating with a different frontend, and this is precisely what we
wanted for cleaner parallel computing, multiprocess clients and a
notebook.  When I returned from Colombia I had a free weekend and
drove down to his place, and in just two days we had a prototype of a
python shell over zmq working, proving that we could indeed build
everything we needed.

Shortly thereafter, we had discussions with Enthought, who offered to
support Brian and me to work in collaboration with Evan Patterson and
build the Qt console using this architecture.  Our little prototype
had been just a proof of concept, but this support allowed us to spend
the time necessary to apply the same ideas to the real IPython. Brian
and I would build a zeromq kernel with all the IPython functionality,
while Evan built a Qt console that would drive it using our
communications protocol.  This worked extremely well, and by late 2010
we had a more or less complete Qt console working.

In October 2010 James Gao (a Berkeley neuroscience graduate student)
wrote up a quick prototype of a web notebook, demonstrating that the
kernel design really worked well and could be easily used by a
completely different client.  And then in the summer of 2011, Brian
took James' prototype and built up a fully working system, this time
using the Tornado web server (which, ironically, we'd looked at in
early 2010 as a candidate for our communications but dismissed, as
it wasn't really the tool for that job), JQuery, CodeMirror and
MathJax.  That's the notebook that we then polished over the next few
months to finally release in 0.12.

As this long story shows, while it's taken us a very long time to get
here, what we have now makes a lot of sense for us, even considering
the existence of the Sage notebook and how good it is for many use
cases.

Our notebook is just one particular aspect of a much larger and richer
architecture built around the concept of a Python interpreter
abstracted over a JSON-based, explicitly defined communications
protocol (http://ipython.org/ipython-doc/rel-0.12/development/messaging.html).
Even considering http clients, the notebook is still just one
possible client: you can easily build an interface that only evaluates
a single cell with a tiny bit of javascript like the Sage single cell
server, for example.
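
To make the idea of a JSON-based protocol a bit more concrete, here is a
rough sketch of what building one request might look like.  The field
names (header, parent_header, content, msg_id, session, msg_type) follow
my reading of the messaging document linked above and should be checked
against it; the helper make_message is hypothetical, and the actual
transport over zmq sockets is left out entirely.

```python
import json
import uuid


def make_message(msg_type, content, session, username="user", parent_header=None):
    """Build a dict shaped like the messages in the linked spec.

    Hypothetical helper for illustration; not IPython's own code.
    """
    return {
        "header": {
            "msg_id": str(uuid.uuid4()),  # unique id for this message
            "username": username,
            "session": session,           # shared by all messages of one client
            "msg_type": msg_type,
        },
        # A reply would carry the header of the request it answers here.
        "parent_header": parent_header or {},
        "content": content,
    }


session = str(uuid.uuid4())
request = make_message(
    "execute_request",
    {"code": "1 + 1", "silent": False},
    session,
)

# Because the message is plain JSON, any client in any language can
# produce or consume it.
wire_text = json.dumps(request)
```

The point is only that a client needs nothing more than JSON and a
socket library to talk to a kernel, which is why the notebook, the Qt
console and a single-cell page can all be thin frontends.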

Furthermore, since Min also reimplemented the parallel machinery
completely with pyzmq, now we have one truly common codebase for all
of IPython. We still need to finish up a bit of integration between
the interactive kernels and the parallel ones, but we plan to finish
that soon.

We deliberately wrote the notebook to be a lightweight, single-user
program meant to keep its files next to the rest of your scripts and
other files.  The sage notebook draws many parallels with the google
docs model, requiring a login and showing all of your notebooks
together, kept in a location separate from the rest of your files.  In
contrast, we want the notebook to just start like any other program
and for the ipynb files to be part of your normal workflow, ready to
be version-controlled just like any other script or file and easy to
manage on their own.

There are other deliberate differences of interface and workflow:

- We keep our In/Out prompts explicit because we have an entire system
of caching variables that uses those numbers, and because those
numbers give the user a visual clue of the execution order of cells,
which may differ from the document's order.

- We deliberately chose a structured JSON format for our documents.
It's clear enough for human reading while allowing easy and powerful
machine manipulation without having to write our own parsing.  So
writing utilities like rst or latex converters (as we recently
showed) is very easy.

- Our move to zmq allowed us (thanks to Thomas' tireless work) to ship
the notebook working both on python2 and python3 out of the box.  The
sage notebook only works on python2, and given their use of Twisted it
will probably be some time before they can port to python3.

- Because our notebook works in the normal filesystem, and lets you
create .py files right next to the .ipynb just by passing --script at
startup, you can reuse your notebooks like normal scripts, import from
them, etc.  I'm not sure how to import a sage notebook from a normal
python file, or if it's even possible.

- We have a long list of plans for the document format: multi-sheet
capabilities, latex-style preamble, per-cell metadata, structural
cells to allow outline-level navigation and manipulation such as in
LyX, ... For that, we need to control the document format ourselves so
we can evolve it according to our needs and ideas.
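
As a small illustration of the JSON document format point in the list
above, here is a sketch of how little code a converter needs when no
custom parser is involved.  The document layout used here (a "cells" list
with "cell_type" and "source" keys) is a simplified, hypothetical
stand-in, not the actual .ipynb schema, and code_cells_to_script is a
made-up helper name.

```python
import json

# Hypothetical, simplified notebook-like document; NOT the real .ipynb schema.
doc_text = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": "Compute a sum"},
        {"cell_type": "code", "source": "total = 1 + 2"},
        {"cell_type": "code", "source": "print(total)"},
    ]
})


def code_cells_to_script(text):
    """Return the code cells of the document joined into one script."""
    doc = json.loads(text)  # the stdlib does all of the parsing
    sources = [
        cell["source"]
        for cell in doc["cells"]
        if cell["cell_type"] == "code"
    ]
    return "\n".join(sources) + "\n"


script = code_cells_to_script(doc_text)
```

A converter to rst or latex would have the same shape: load the JSON,
walk the cells, and emit text per cell type.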

As you see, there are indeed a number of key differences between our
notebook and the sage one, but there are very good technical reasons
for this (in addition to the licensing point already mentioned).  The
notebook integrates with our architecture and leverages it; you can
for example use the interactive debugger via a console or qtconsole
against a notebook kernel, something not possible with the sage
notebook.

I'd like to close by emphasizing that we have had multiple, productive
collaborations with William and other Sage devs in the past, and I
expect that to continue to be the case.  On certain points that
collaboration has already led to convergence; e.g. the new Sage single
cell server uses the IPython messaging protocol, after we worked
closely with Jason Grout during a Sage Days event in March 2011 thanks
to William's invitation.  In the future we may find other areas where
we can reuse tools or approaches.  It is clear to us that the Sage
notebook is a fantastic system; it just wasn't the right fit for
IPython.  I hope this very long post illustrates why.

Whew, that was a lot!  I probably should turn this into a blog post at
some point... Don't hesitate to ask questions on this, I promise much
shorter replies in the future :)

Cheers,

f
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ipython-0.0.1.py
Type: application/octet-stream
Size: 8515 bytes
Desc: not available
Url : http://mail.scipy.org/pipermail/ipython-user/attachments/20120107/39588141/attachment-0001.obj 

