[SciPy-User] Pylab - standard packages

Nathaniel Smith njs@pobox....
Fri Sep 21 16:43:50 CDT 2012

Hi Fernando,

Excellent rant. To be clear, I have no objection at all to the idea of
supporting notebooks in "pylab" -- my only concern is from the other
direction entirely, that if the "pylab" idea is going to make a
difference it will be because it gets people thinking and talking
about how to work together towards shared goals that they already
have, and because it documents and clarifies existing consensus. The
IPython notebook has tons of buzz, you should obviously be proud of
it, and maybe by next year it'll seem ridiculous to everyone that you
would ever start a new Python user on anything else. But empirically,
that's not true yet, and the way to get there is for you guys to
continue kicking ass, not for "pylab" to legislate something. Trying
would alienate people. So it's just a process and scope objection,
nothing to do with the notebook idea at all.


On Fri, Sep 21, 2012 at 9:38 PM, Fernando Perez <fperez.net@gmail.com> wrote:
> Warning: what follows is a highly opinionated, completely biased post.
>  I'll be using a 'we' that refers to the IPython developers because
> the credit for much of what I talk about goes to the whole team, but
> ultimately the rant is my responsibility, so flame me if need be.
> self.put_hat(kind='IPython').
>
> I think it's important to address directly the question of the IPython
> notebook.  I realize that not everybody uses it, and it has some extra
> dependencies (though they are really easy ones to satisfy).  But I
> also think it's an important discussion that goes to the question of
> whether we are simply trying to play catch-up with what matlab/R-Rstudio
> offer, or to be truly forward-looking and rethink how scientific
> computing will be done for the coming decade.  Needless to say, I have
> little interest in the former and am putting all my energy into the
> latter: if it were otherwise, I'd have been contributing to Octave for the
> last 10 years instead.
>
> My argument, in short: we should consider *some* notebook-type tool as
> a first-class citizen of this effort, for the simple reason that such
> an approach is one whose time has come.  A notebook environment is the
> only tool that truly tackles in an integrated manner the problem that
> we've been referring to as the 'lifecycle of a scientific idea'
> (https://speakerdeck.com/u/fperez/p/ipython-tools-for-the-lifecycle-of-research-computing?slide=3).
>
> Context: all disciplines are becoming intensely computational, the
> need for real-time collaboration on live computational analysis is
> great, the pressures for moving towards truly reusable, reproducible
> work are coming from multiple angles (major journals, funding
> agencies, ...), we need a much smoother transition between analysis
> codes and publications, and we need better ways to share our analysis
> work over the internet, for education and for archival purposes.
>
> Having a good IDE is a really important point, and my hat is off to
> the stellar work the Spyder team has done (and coincidentally, another
> Colombian physicist, Carlos Córdoba, is leading the charge on the
> spyder/ipython integration work).  But to be blunt, a matlab-style
> IDE does not tackle the important questions above in any meaningful
> way.
>
> In the last decade's worth of the pylab world (using our new moniker
> in its intended fashion), we've certainly taken inspiration from the
> major systems out there, but it has always been that: *inspiration*,
> never simple copying:
> - John Hunter's brilliance with matplotlib was not so much to copy the
> high-level API and look/feel of plot windows to ease the transition
> from matlab.  It was to rethink the question of what a plotting
> library should be, abstracting over GUI toolkits and putting an elegant
> OO architecture underneath the familiar scripting interface.
> - Numpy's arrays are similar to matlab/fortran ones, obviously, but
> when used with the full power of slicing, fancy indexing and
> structured dtypes, they make matlab's look like the 1970s relic they
> are.  Jim Hugunin, Perry and Travis led the way to build something
> that has no match.
> - The one-man army that is Wes McKinney had R's DataFrame squarely in
> his sights when he built pandas, but he went far, far beyond the basic
> ideas in R to provide one of the most powerful packages we've seen in
> recent memory.
> - etc... you get my point.
>
> Now, as I said above, the scientific computing world is changing, and
> more importantly, a lot of things in the broader scientific world are
> also undergoing very drastic changes: the push for open access, data
> sharing and reproducibility of results is likely to make a lot of
> things look very different in 10 years than they do now.  We can argue
> that the whole online education wave of Coursera/Udacity/EdX is a bit
> of a bubble, but there's no denying the internet will play a role in
> how scientists are trained both in and out of traditional academia.
>
> I argue that, after having spent the last decade building up the pylab
> foundations to be competitive with the 'big boys', we are uniquely
> well positioned to stop following and actually lead on many of these
> problems.  And for that, my contention is that it is absolutely
> necessary to have:
> - A tool that bridges the gaps between exploratory work,
> collaboration, production, publication and education.
> - An open format for sharing, publishing and archiving executable
> computational work.
> - A system that is accessible through the browser, so that computation
> can be located where the data is, since we can't move the data to the
> desktop anymore.  Remote collaboration also is most sensibly tackled
> via a browser, as google docs has amply demonstrated.
>
> Up until now I have *not* said that we should use the *IPython*
> notebook.  Our efforts on this front are, I am sure, full of
> limitations and imperfections.  But if we're not going to tackle the
> problems above, I would like it to be with an explicit decision on
> whether it is because:
> 1. this community only wants to stick to a traditional
> shell+editor/IDE approach.
> 2. the IPython solution is the wrong one, it has technical flaws, etc.
>
> If it's #1, I think it would be a huge, huge mistake, one born of a lack
> of foresight, ambition and vision.  If that's the decision, I'm sure
> that we in the IPython team will simply continue fighting for that
> vision on our own, as we are pretty convinced it's the right thing to
> do.  And evidence is mounting that others think the same too:
> - Michigan State University is teaching *two* courses on advanced
> genomics that are heavily notebook based:
> http://ged.msu.edu/angus/beacon-2012/index.html,
> https://github.com/ngs-docs/ngs-notebooks.
> - At Berkeley we have (but this is not driven by me) both an intensive
> bootcamp and a semester-long course on scientific python with the
> same:
> https://github.com/profjsb/python-bootcamp,
> https://github.com/profjsb/python-seminar.
> - We can now blog straight off the notebook
> (http://blog.fperez.org/2012/09/blogging-with-ipython-notebook.html),
> and Jose Unpingco is effectively writing a full book on signal
> processing as a series of blog posts that are notebooks:
> http://python-for-signal-processing.blogspot.com.
> - there's more, just google it.
>
> Now, if the reluctance is to go with the *IPython* notebook, then I'd
> like to know what the alternative is.  We have effectively put 10
> years of work into this problem, and the current implementation is the
> third or fourth attempt
> (http://blog.fperez.org/2012/01/ipython-notebook-historical.html).  We
> know it's by no means perfect, but honestly I think it would be a lot
> more sensible to fix whatever our limitations are than to start from
> scratch yet again.  So by all means beat on the format, work with
> us to improve it so it meets your needs, let us know what's wrong with
> it or help us improve the tooling around it (ipython itself, the
> nbconvert tools, the nbviewer.ipython.org site, etc...).  But to be
> blunt, please don't think that ignoring 10 years of work on this
> problem is the right approach.
>
> In summary, I think that sticking to a shell+editor/IDE view of the
> problem would be missing a huge opportunity to play a key role in
> shaping the next decade's worth of scientific computing. And by the
> way, it's not like the others are standing still here:
> - Wolfram is busy at work promoting a closed, highly proprietary idea
> (http://www.wolfram.com/cdf-player).
> - Matlab is building a solution around Microsoft Word:
> http://www.mathworks.com/help/matlab/matlab_prog/create-a-matlab-notebook-with-microsoft-word.html.
> They have a huge market share and resources, so they can and will
> push pretty deep with this.
> - The R community has rapidly banded behind knitr (http://yihui.name/knitr).
>
> If the pylab community decides to not tackle this problem (and
> opportunity!) head-on, at least from IPython we will continue.  I
> currently have 5 grants in the pipeline all of which would provide, if
> funded, some measure of support for this kind of work.  We all know
> funding is a crap shoot, but even if only some of them go through we
> should have a decent amount of resources not only for our (this
> includes Brian, who's also involved with several) own time but also
> for students, postdocs and developers, to tackle this.  And I simply
> view it as too important not to continue fighting in this direction.
>
> Now, after all this rant, I want to make clear that I'm *not* saying
> that we should stop talking about the simple shell or that everyone
> should switch to *only* using notebooks.  One important property of
> IPython notebooks is that it is very easy to generate a pure .py
> script out of any notebook, any time (and we know how to improve those
> conversion facilities quite a bit).  So even if a project decides to
> ship all of its examples as notebooks, it's trivial to ensure that
> they are also accessible in pure script form to be run from the
> command line or loaded into spyder/IDLE/etc as well as converted to
> clean html in the sphinx-built documentation.
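
To make the notebook-to-script point above concrete, here is a minimal
sketch, not the actual IPython conversion code. It assumes the notebook
is JSON with a top-level "cells" list whose entries carry "cell_type"
and "source" (the nbformat v4 layout; other versions of the format
arrange cells differently). The function name notebook_to_script is
hypothetical and used only for illustration.

```python
import json


def notebook_to_script(nb_json: str) -> str:
    """Return the code cells of a notebook as one plain Python script.

    Assumption: nbformat v4-style JSON with a top-level "cells" list.
    """
    nb = json.loads(nb_json)
    chunks = []
    for cell in nb.get("cells", []):
        # Skip markdown and raw cells; only code belongs in the script.
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # "source" may be a single string or a list of line strings.
        if isinstance(source, list):
            source = "".join(source)
        if source.strip():
            chunks.append(source.rstrip("\n"))
    # Separate cells with a blank line and end with a single newline.
    return "\n\n".join(chunks) + "\n"


# A tiny in-memory notebook: one markdown cell and two code cells.
example = json.dumps({
    "nbformat": 4,
    "cells": [
        {"cell_type": "markdown", "source": ["# Demo\n"]},
        {"cell_type": "code", "source": ["x = 2\n", "y = x * 21\n"]},
        {"cell_type": "code", "source": "print(y)\n"},
    ],
})

script = notebook_to_script(example)
print(script)
```

The resulting string can be written to a .py file and run from the
command line or opened in any editor. A real converter also has to deal
with IPython-specific syntax such as magics, which this sketch ignores.
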
> Furthermore, the notebook is not the tool for building large-scale
> library code, so there will always be a place for
> emacs/vim/textmate/spyder, where the focus is more on the
> 'development' than the interactive exploration/analysis.
>
> But having notebooks in the projects, once we also build tools for
> cross-project help indexing, will let us provide users with powerful
> help that can search for a term across all the installed
> pylab-compliant tools and will give one-click access to live,
> executable examples they can modify immediately.  Mathematica has had
> this for over a decade and it is absolutely extraordinary.  The same
> tools can also index the pure .py versions, of course, but after 5
> years of not having a Mathematica license, I still miss this every
> time I have to trawl multiple online galleries looking for something
> in the pylab world.
>
> OK, I doubt anyone is reading by now, so I'll stop here...  Flame away.
> f
> _______________________________________________
> SciPy-User mailing list
> SciPy-User@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
