[SciPy-User] Pylab - standard packages

Nathaniel Smith njs@pobox....
Fri Sep 21 08:01:39 CDT 2012


On Fri, Sep 21, 2012 at 12:53 PM,  <josef.pktd@gmail.com> wrote:
> On Fri, Sep 21, 2012 at 6:52 AM, Thomas Kluyver <takowl@gmail.com> wrote:
>> Thanks Fernando, you've coherently explained what I was trying to say
>> about why we can and should take sides.
>>
>> On 21 September 2012 02:19, Fernando Perez <fperez.net@gmail.com> wrote:
>>> I think this has been danced around but not really discussed with
>>> enough precision: a clear dividing line should be drawn between "needs
>>> a compiler" and not.  Because the complexities of getting a compiler
>>> off the ground in some platform are not trivial, and the details
>>> change over time, I think that the 'base level' should consist of very
>>> broadly applicable tools that do *not* need a C compiler to be
>>> installed for working.  The 2nd level would require a C compiler, thus
>>> putting Cython (and in the future numba or similar tools if llvm
>>> becomes more widely accepted as the path forward) squarely in that
>>> camp.
>>
>> I like this way of drawing a clear, objective distinction between
>> levels. We would still need to work out how to present the different
>> levels to users, but that's something I think we could resolve.
>>
>>> I've had for a while this basic 'layering' of the ecosystem in my mind
>>> that I use as a starting point for these conversations:
>>>
>>> https://speakerdeck.com/u/fperez/p/1204-biofrontiers-boulder?slide=21
>>>
>>> I think if you take all that minus Cython and Mayavi (for dependency
>>> complexity reasons, VTK is a non-trivial beast to deal with too), the
>>> rest is a pretty decent core that covers a lot of what a good fraction
>>> of undergraduate courses in the sciences would broadly need.
>>
>> To save people a click, Fernando's tiers look like this:
>>
>> Python
>> ---
>> Numpy
>> ---
>> IPython, Scipy, Matplotlib, SymPy
>> ---
>> pandas, StatsModels, scikits-learn, scikits-image,
>> PyTables, NetworkX
>>
>> That seems like a vision of a much more comprehensive environment than
>> we had been discussing, but all those packages are familiar names at
>> Scipy conferences, and it would inarguably make a much more capable
>> environment out of the box than just numpy+scipy+mpl. If we were to
>> use that as a starting point, would anyone like to argue against
>> including some of those packages?
>>
>> Almar has already spoken against specifying an interface. I'm actually
>> leaning the other way, although I accept that I could be biased by my
>> role in IPython. For introductory tutorials, I think it would be very
>> valuable to have a common interface, so we can describe, say, what to
>> press to run some code. Otherwise, users would be put off by having to
>> try to apply a generalised tutorial to their particular environment,
>> and interpret screenshots that don't match what they see. In
>> particular, the IPython notebook is a very different model from most
>> IDEs.
>
> I think for an out of the box working environment, I would include
> both ipython and spyder.
> One reason is that it requires packaging or instructions for their
> dependencies, especially pyside/pyqt
> ( I have inconsistent versions in my Windows python versions but I'm
> not interested enough to figure out how to get the qtconsole to work
> after each update.)
>
> ipython's popularity is unquestionable; in statsmodels we have started
> to include notebooks as part of the documentation.
> spyder gives a GUI similar to other packages that Windows users will be
> familiar with (Matlab, Stata without the stats specific menus, ...)

I'm not sure the "pylab brand" can really dictate which interface
actual distributions should include... there are multiple that have
good reasons to exist and aim at somewhat different niches. I doubt
the Python(x,y) folks are going to stop recommending Spyder just
because the IPython folk suggest it :-). My guess is that the most good
we can do is to document and push for standardization in the places
where there's consensus. (There's that word again...)

Some ideas:
- No matter which overall interface you use, you will at some point
want a REPL. We should recommend that for a "pylab environment" this
always defaults to an IPython shell. (Spyder for example supports both
the vanilla ">>>" shell and the IPython "In [n]" shell, both accessed
through a menu that confuses *me*, never mind newbies...) This seems
much less controversial than specifying the overall UI, and would
already be very valuable to those of us trying to write docs, because
every time we write down an example we have to pick one! It's a very
visible difference to newbies, and trivial to fix.

- A "pylab shell" should include some standard, tasteful set of stuff
pre-imported. (I'm guessing this should be less than what you get
right now from "from pylab import *", maybe no more than "import numpy
as np; import matplotlib.pyplot as plt", but I don't have a strong
opinion.)

- Calling matplotlib plotting functions from a "pylab shell" should
Just Work with no further configuration. Of course what that means
exactly depends on the details of the interface (pop up a window? pop
up a tab in your IDE? insert a graph in-line in the output?), but
that's ok.

- I'd actually like to see some guidelines for how installing packages
should work. Obviously we're limited by the Python Packaging Mess(tm),
but honestly some small environment configuration rules could make the
end-user experience *much* better than it is right now. In R, you can
just type install.packages("some-package") at the command prompt, and
it'll just fetch it from their PyPI-equivalent and stick it in a
user-local directory with no fuss. Perhaps a requirement for being a
"pylab shell" is that the distributor has to make sure that a
user-writeable site-packages directory is available and used by
default. More ambitiously, I tend to think that requiring the
availability of a compiler really is a good idea, given that there are
freely distributable ones for all relevant platforms... and some sort
of virtualenv-management tools wouldn't be a bad idea either...

- I'd like to see some simple conventions for things like "given a
package name, here is how you find its full sphinx docs" (to enable
things like "search all the docs of all installed packages"), "given a
function, here is how you find machine-readable example code" (in R,
you can type example(anyfunction) and it will auto-magically
copy-paste a nice example into your prompt right there), "given a
package name, here is how you run its tests". This is more of a
project in that it requires writing some code, not just listing
recommendations on a wiki page somewhere, but maybe "pylab" can be a
banner to inspire people to do that.
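
A rough sketch of the example(anyfunction) idea from the last item,
assuming a package's examples live in its docstrings as doctests. The
helper name `example` and the demo function `double` are made up for
illustration; this is not an existing API in any package, only one way
such a convention might be built on the standard library's doctest
module:

```python
import doctest


def example(func):
    """Return the doctest examples in func's docstring as plain source text.

    Hypothetical helper, loosely modelled on R's example(); not an existing API.
    """
    parser = doctest.DocTestParser()
    doc = func.__doc__ or ""
    # get_examples() yields doctest.Example objects; .source holds the code
    # with the ">>> " / "... " prompts stripped, ending in a newline.
    return "".join(ex.source for ex in parser.get_examples(doc))


def double(x):
    """Double a number (demo function, made up for this sketch).

    >>> double(21)
    42
    """
    return 2 * x


if __name__ == "__main__":
    # Print the extracted example code, ready to paste at a prompt.
    print(example(double), end="")
```

This only recovers examples written in doctest form; a real convention
would also have to say where longer, non-doctest examples should live.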

Basically I feel like the main value a "pylab brand" can provide is by
finding places where user quality-of-life can be improved by
standardizing and documenting conventions, and then evangelizing those
to package and distribution authors.
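
For the user-writeable site-packages suggestion above, here is a
minimal sketch of a check a distributor could run, assuming the
standard library's `site` module reflects how the distribution is
configured. The function name `user_site_report` and the report keys
are invented for illustration:

```python
import os
import site
import sys


def user_site_report():
    """Summarise the state of the per-user site-packages directory.

    Hypothetical helper for illustration; the keys are not any standard.
    """
    path = site.getusersitepackages()
    return {
        "path": path,
        # ENABLE_USER_SITE is True when the user site directory is enabled,
        # and False or None when it has been disabled.
        "enabled": bool(site.ENABLE_USER_SITE),
        # The directory is normally only created on first use.
        "exists": os.path.isdir(path),
        "on_sys_path": path in sys.path,
    }


if __name__ == "__main__":
    for key, value in user_site_report().items():
        print(key, "=", value)
```

A report with "enabled" set to False (as is typical inside a
virtualenv) would mean packages installed per-user are not picked up
by default, which is the situation the suggestion aims to rule out.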

-n

