[SciPy-User] Pylab - standard packages

Fernando Perez fperez.net@gmail....
Thu Sep 20 20:19:38 CDT 2012


On Wed, Sep 19, 2012 at 4:35 AM, Thomas Kluyver <takowl@gmail.com> wrote:

> Inevitably, yes, we reduce competition to some extent. But I'm not
> sure this is such a bad thing: we want there to be 'one obvious way to
> do things'. Having a lot of alternatives can be confusing for users,
> and it can fragment developer effort, so none of the alternatives are
> as good as they could be.

I think this can be framed by analogy to the Python standard library:
yes, it "picks sides"; no, that doesn't prevent competition.  We're
now on the second option-handling package in the stdlib after argparse
out-did optparse enough to be included; there's similar talk of
putting a new regex package in; elementtree went in despite other XML
tools being there, etc.  There's *enormous* value in giving users some
basic guidance, and hopefully as these 'blessed' tools establish
improved interoperability practices, documentation and packaging
guidelines, etc, it will also make the process of incorporating
third-party packages easier.

Obviously, we should clearly indicate when alternatives exist to the
base tools and point out how they can be a better fit for some
users/tasks (say Chaco instead of MPL if you're building an
interactive app with the traits reactive programming model).

> We have to take sides in some debates - not as a partisan move to
> bolster our favourite projects, but so that users get a coherent stack
> of useful tools. Useful tools have competition, and we can't say every
> alternative is important.

Absolutely.  It's not like this is going to make Google stop working,
so people will always be free to try new things.  But what will happen
is that hopefully we'll develop practices that will make the whole
ecosystem, core packages and external ones, in general integrate
better for end users, with a smoother installation/documentation/usage
experience. The development of this will be driven by the core but the
resulting conventions and tools will be usable by all projects.
Ultimately we want an ecosystem similar to say the R one, where many
(even competing) packages can exist, but there's a clear core to start
from and an easy path for users to bring in new functionality.

>> 3. Define several levels of Pylab:

I think this has been danced around but not really discussed with
enough precision: a clear dividing line should be drawn between "needs
a compiler" and not.  Because the complexities of getting a compiler
off the ground on some platforms are not trivial, and the details
change over time, I think that the 'base level' should consist of very
broadly applicable tools that do *not* need a C compiler to be
installed for working.  The 2nd level would require a C compiler, thus
putting Cython (and in the future numba or similar tools if llvm
becomes more widely accepted as the path forward) squarely in that
camp.

I think it would be a huge mistake to make a compiler a requirement
for the base level: not that I'm not a huge fan of Cython and related
tools, but we really need the on-ramp to be a very, very easy one for
newcomers.  And unfortunately, between 32- and 64-bit windows, mingw
vs the MS compilers, the vagaries of Xcode versions on OSX and how to
install it, etc, it's a bag of thorns likely to put many newcomers
off.
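
As a rough illustration of where that dividing line falls, here is a
minimal sketch that checks whether any C compiler is visible on PATH.
The candidate names, the function names and the two level labels are
assumptions made up for this example; they are not part of any Pylab
specification.  Finding an executable also doesn't prove that it is
correctly configured (MSVC in particular needs its environment set
up), so treat this as a hint only.

```python
import shutil

# Assumed list of common C compiler executables; adjust per platform.
CANDIDATES = ("cc", "gcc", "clang", "cl")


def find_c_compiler(candidates=CANDIDATES):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


def suggested_level(compiler_path):
    """Map compiler availability to a (hypothetical) install level."""
    if compiler_path:
        return "base + compiled extensions"
    return "base only"


if __name__ == "__main__":
    compiler = find_c_compiler()
    print("C compiler:", compiler or "none found")
    print("Suggested level:", suggested_level(compiler))
```

A newcomer for whom this reports "none found" is exactly the user the
base level has to work for without any further setup.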

For a while I've had this basic 'layering' of the ecosystem in mind,
which I use as a starting point for these conversations:

https://speakerdeck.com/u/fperez/p/1204-biofrontiers-boulder?slide=21

I think if you take all that minus Cython and Mayavi (for dependency
complexity reasons; VTK is a non-trivial beast to deal with too), the
rest is a pretty decent core that covers a lot of what a good fraction
of undergraduate courses in the sciences would broadly need.  Not
every last discipline-specific problem is there, but it hits matlab at
all the right points as well as making a very credible case against
the core of R, and I think that's how we should think of it. The base
system should be a very solid replacement for typical usage of a
base matlab or R installation (which is why I think that the triad of
pandas, statsmodels and sklearn is absolutely essential, given where
science and data analysis are going right now).

Cheers,

f

