[Numpy-discussion] Re: Meta: too many numerical libraries doing the same thing?

Joe Harrington jh at oobleck.astro.cornell.edu
Sat Nov 24 19:14:02 CST 2001

Yes, this issue has been raised here before.  It was the main
conclusion of Paul Barrett's and my BOF session at ADASS 5 years ago
(see our report at
http://oobleck.astro.cornell.edu/jh/ast/papers/idae96.ps).  The main
problems are that we scientists are too individualistic to get
organized around a single library, too pushed by job pressures to
commit much concentrated time to it ourselves, and too poor to pay the
architects, coders, doc writers, testers, etc. to write it for us.
Socially, we *want* to reinvent the wheel, because we want to be
riding on our own wheels.  Once we are riding reasonably well for our
own needs, our interest and commitment vanish.  We're off to write
the next paper.

Following that conference, I took a poll on this list looking for help
to implement the library.  About half a dozen people responded that
they could put in up to 10 hours a week, which in my experience isn't
enough, once things get hard and attrition sets in.  Nonetheless, Paul
and I proposed to the NASA Astrophysics Data Analysis Program to hire
some people to write it, but we were turned down.  We proposed the
idea to the head of the High Energy Astrophysics group at NASA
Goddard, and he agreed -- as long as what we were really doing was
writing software for his group's special needs.  The frustrating thing
is how many hundreds of astronomy projects hire people to do their 10%
of this problem, and how unwilling they are to pool resources to do
the general problem.

A few of the volunteers who answered my query to this list have gone
on to do SciPy, to their credit, but I don't see them moving in the
direction we outlined.  Still, they have the capacity to do it right
in Python
and compiled code written explicitly for Python.  They won't solve the
general problem, but they may solve the first problem, namely getting
a data analysis environment that is OSS and as good as IDL et al. in
terms of end-to-end functionality, completeness, and documentation.

I like the notion that the present list is for designing and building
the underlying language capabilities into Python, and for getting them
standardized, tested, and included in the main Python distribution.
It is also a good place for debating the merits of different
implementations of particular functionality.  That leaves the job of
building coherent end-user data analysis packages (which necessarily
have to pick one routine to be called "fft", one device-independent
graphics subsystem, etc.) to application groups like SciPy.  There can
be more than one of these, if that's necessary, but they should all
use the same underlying numerical language capability.

I hope that the application groups from several array-based OSS
languages will someday get together and collaborate on an ueberlibrary
of numerical and graphics routines (the latter being the real sticking
point) that are easily wrapped by most languages.  That order seems
backwards, but I think the social reality is that this is how it will
happen, if it ever happens at all.

