Sat Jun 9 16:31:49 CDT 2012
On Fri, Jun 8, 2012 at 8:49 AM, Bob McElrath <email@example.com> wrote:
> Well I'll just say that IPython and matplotlib have quickly become my primary
> computational tool. So I would not simply call it an "exploratory" tool. I
> could already do exploratory computing with the (i)python command line. The
> notebook allows me to save, repeat/reproduce, and communicate a computation. In
> particular it allows me to record how I produced the final version of a plot.
Absolutely! This is how I've been pitching things to scientific
colleagues lately... IPython is an environment for the full lifecycle
of computationally-based scientific work, which can be simplified for
the sake of discussion to the following phases:
1. Individual exploratory work: play with some code, some data, some
algorithms, on a problem you have some ideas about. I used to do this
with ipython (terminal) + a good text editor; others used more
IDE-type systems like Spyder. But even here the notebook is already a
major step forward, for the reasons you point out above.
... Let's say the above exploration is promising, typically you then move to...
2. Collaboration with colleagues. We used to email scripts or share
version control repos. Now the notebook lets us collaborate *live* by
inviting a colleague onto the same notebook and working together on a
common computation. My most recent two projects (papers in review or
in progress) have been developed in this way, and I can't imagine
doing it any other way now.
... if things continue to work out, then you're likely to need to move
to a parallel/larger environment for production runs...
3. Parallel production work: we support local parallelism,
cluster/supercomputing environments and the cloud (Amazon and Azure).
Of the two papers mentioned above, one was done in the Amazon cloud
and the other in MPI environments, including both beefy 16-core
servers and traditional clusters.
... if your results do pan out, you'll want to publish them!...
4. The same notebook can then be exported to LaTeX or HTML for
publication. We're not really at the point of the notebook being what
you send to a journal, but the notebooks right now are perfect as
supplementary materials that remain *executable*. Send a PDF for
others to read, put up the ipynb for them to execute, hopefully along
with the machine image that has the actual code and data. Again, this
is already a reality:
Titus Brown did it before us: http://ged.msu.edu/papers/2012-diginorm/
A few weeks later we had a similar one:
... Hopefully the material is good enough not just for a paper, but
you actually want to teach it further...
5. The same notebooks can be used for live education. I've started to
see lots of repos on github popping up with notebooks as tutorials,
and again Titus is leading the charge here:
With the traditional approaches we've mostly used so far, each of the
steps above typically required shifting tools completely. That hurts
productivity, kills reproducibility, and fundamentally impairs your
ability to *iterate* on an idea. We've all seen it: the figures look
good, write up the paper, quick! Forget about trying to rerun the
code later if a question arises; just fix up the figure in an image
editor.
This is a horrific way to work, and yet it is how most of today's
computational work is done. We're trying to change that.
> I have been cursing the capabilities, failures, and crashes of Maple and
> Mathematica for more than a decade now. One of the major things where both of
> those packages fall down is plotting. For years I have been exporting data to
> flat files and importing it into xmgrace, because the quality of their plots was
> so poor. Matplotlib is substantially better than any of the above. So, I will
> be producing my plots using it from now on. (And I will undoubtedly be
> exporting data from Maple into an IPython notebook for plotting purposes soon)
> I see the IPython notebook as a replacement for Maple, for me.
I had shifted from Maple to Mathematica long ago, but that path is
very similar to ours.
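For the flat-file route you describe, pulling a Maple-style dump into
the notebook is a couple of numpy calls. A minimal sketch (the
filename and sample values are invented; here the script writes its
own stand-in for the Maple export so it runs self-contained):

```python
import numpy as np

# Stand-in for a Maple export: whitespace-separated numeric columns.
with open("maple_export.dat", "w") as f:
    f.write("0.0 1.0\n0.5 1.6487\n1.0 2.7183\n")

# One call loads the flat file into an array ready for matplotlib.
data = np.loadtxt("maple_export.dat")
x, y = data[:, 0], data[:, 1]
print(data.shape)
```

From there `plt.plot(x, y)` in a notebook cell gives you the
matplotlib-quality figure, with the provenance recorded in the same
document.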
> The interface for creating matplotlib plots is no worse than Gnuplot, Maple, or
> Mathematica, in terms of having to construct the plot using a script. There is
> a lot of room for improvement there. What I would like is to eventually have a
> GUI-like way to do simple manipulation of plots, along the lines of xmgrace and
Absolutely. It's been years since I've used xmgrace, but it did have
some very interesting features.
> Anyway, it's a long term goal. For now I'm just going to try to get a
> reasonable vector-based rather than rasterized output for the notebook. This is
> required so that different screen/browser sizes, font sizes look reasonable.
Yup, and we greatly appreciate the energy you've put into it!
>> But I just fail to see why plots should be resizable, and why plots should be
>> able to emit python code for tracking the changes.
> Personally I do not find zooming/panning terribly useful. If I need to
> zoom/pan, it always means either
> 1. I will have to re-plot to pan anyway
> 2. I will have to recompute at higher resolution so the zoom/pan is smooth
> About the only thing I do with the gtk matplotlib GUI that is useful is change
> the margins.
Zooming and panning in matplotlib (in the interactive gui windows) can
be very useful for analyzing long time series, for example. But
that's obviously zooming with replot, not just pixel-boosting.
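That zoom-with-replot pattern is easy to do from the script side too;
a sketch with made-up data and window bounds (Agg backend so it runs
headless):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# A long, noisy time series: pixel-level zoom would just magnify the
# rasterized line, so instead we re-plot only the window of interest.
rng = np.random.default_rng(0)
t = np.linspace(0, 1000, 50_000)
y = np.sin(2 * np.pi * t / 50) + 0.3 * rng.standard_normal(t.size)

fig, (full, zoomed) = plt.subplots(2, 1)
full.plot(t, y, lw=0.3)

# "Zoom with replot": slice the data, then draw just that slice.
lo, hi = 400, 420
win = (t >= lo) & (t <= hi)
zoomed.plot(t[win], y[win], lw=0.8)
zoomed.set_xlim(lo, hi)
fig.savefig("timeseries_zoom.png")
```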
> The things that are easier to do with an interactive plot are logged axes, tick
> mark spacing, line type/color, axis labels, legend, title, grid, 3d rotation.
> None of which you can do with the matplotlib gui.
Matplotlib does 3d axis rotation and can toggle log/linear axes (not
sure what you meant by 'logged' axes), but you're right that
it doesn't do the others.
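For what it's worth, everything on that list can at least be set from
the script side, even if not from the GUI. A quick sketch (data and
filenames purely illustrative):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator
from mpl_toolkits.mplot3d import Axes3D  # noqa: enables 3d projection
import numpy as np

x = np.linspace(1, 10, 100)
fig, ax = plt.subplots()
ax.plot(x, x**2, "r--", label="$x^2$")          # line type/color
ax.set_yscale("log")                            # log axis
ax.xaxis.set_major_locator(MultipleLocator(2))  # tick mark spacing
ax.set_xlabel("x")                              # axis labels
ax.set_ylabel("y")
ax.set_title("Scripted styling")                # title
ax.legend()                                     # legend
ax.grid(True)                                   # grid
fig.savefig("styled.png")

# 3d rotation, scripted rather than dragged:
fig3 = plt.figure()
ax3 = fig3.add_subplot(projection="3d")
ax3.view_init(elev=30, azim=45)
```

Scripting these is also what keeps them reproducible in a notebook,
which the GUI toolbar can't give you.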
In any case, we have a ton of work ahead of us on this front. Glad to
have more hands on deck!