[SciPy-dev] toward scipy 1.0

David Cournapeau cournape@gmail....
Wed Nov 5 01:56:52 CST 2008


On Wed, Nov 5, 2008 at 9:20 AM, Pierre GM <pgmdevlist@gmail.com> wrote:

> If we're going to reorganize scipy, I'd be pretty much in favor of modularity:
> let me install just the packages I need (scipy.stats, scipy.special,
> scipy.whatever) without bothering about the ones I'll never use.

Yes, that's an ideal world, but it is hard to do in practice with a
finite amount of time.

>
> Breaking scipy into smaller packages sounds a lot like scikits, but is it that
> bad a thing ? Would it make development more difficult ? Would it make
> installation and maintenance more complex ?

Yes. Generating 10 packages instead of one increases the work. It
means each of them can be updated independently, so the number of
configurations to test grows exponentially with the number of
packages. It really is a lot of work - unless we skip QA and release
packages which may be broken on some platforms. OTOH, having one big
set of packages released together means that a single one can postpone
the release; there is a balance to find.
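
To make the combinatorial growth concrete, here is a rough sketch in
Python. The count of 10 packages comes from the paragraph above; the
number of supported versions per package and the number of platforms
are made-up figures for illustration only.

```python
from itertools import product


def count_configurations(n_packages, versions_per_package, n_platforms=1):
    """Number of distinct install combinations if every package can sit
    at any of its supported versions independently of the others."""
    return (versions_per_package ** n_packages) * n_platforms


# One monolithic release: a single version to test on each platform.
monolithic = count_configurations(1, 1, n_platforms=3)

# Ten independently released packages, 2 supported versions each (assumed).
split = count_configurations(10, 2, n_platforms=3)

# Enumerate a tiny case explicitly to check the formula:
# 3 packages with 2 versions each.
small = list(product(["v1", "v2"], repeat=3))
assert len(small) == count_configurations(3, 2)

print(monolithic, split)
```

Even with only two live versions per package, the split layout has
about a thousand times more combinations than the single release, and
nobody is going to test them all.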

We can propose all kinds of methods for better releasing, but IMO, the
uncomfortable truth is that we simply lack the manpower to do more
than what is already being done. I personally would be much more
comfortable with more code put in scipy if it meant that at the same
time, more people would be willing to participate in the task. This
has not been the case; everybody wants to spend time coding new
algorithms, new APIs, etc... Nobody wants to spend time on platform
idiosyncrasies, platform-specific bugs, etc...

> As long as there's one standard
> for setup.py, things should go OK, shouldn't they ?

Unfortunately, no. It is true for packages which are pure Python, more
or less true for packages with only C code and no dependencies, and
not true at all for everything else (including Fortran, C++, etc...).
For example, what if you install one package built against one version
of BLAS/LAPACK, and another built against a different one? Crashes,
wrong results, a lot of nasty things.
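
As an illustration of the kind of check a common setup.py does not give
you for free, here is a minimal sketch that flags an install where
packages were linked against different BLAS/LAPACK builds. Everything
in it is hypothetical: the function name, the package names and the
build labels are invented, and how you would actually collect that
information depends on the platform.

```python
from collections import defaultdict


def find_blas_conflicts(linked_blas):
    """Group packages by the BLAS/LAPACK build they were linked against.

    Returns the grouping if more than one build is in use, else {}.
    """
    by_build = defaultdict(list)
    for package, build in linked_blas.items():
        by_build[build].append(package)
    return dict(by_build) if len(by_build) > 1 else {}


# Hypothetical install: names and builds are invented for the example.
installed = {
    "scipy.linalg": "ATLAS 3.8",
    "scipy.sparse": "ATLAS 3.8",
    "scikits.example": "reference BLAS",
}

conflicts = find_blas_conflicts(installed)
for build, packages in conflicts.items():
    print(build, "->", ", ".join(packages))
```

With a single release, this is decided once at build time; with
separately installed packages, somebody has to check it for every
combination.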

I think what we should follow is an R-like model: a well maintained
core, with code that people are willing to maintain for some time. And
then, hopefully, it can be used by most other packages, including
scikits, without them depending too much on each other, and with an
infrastructure to support this. Everything else, in the current state
of affairs, has no chance of succeeding, because scipy developers are
already overbooked.

> As you'd have guessed, I'm all in favor of a kind of central repository like
> cran or ctan. Each scikit could come with a few keywords (provided by their
> developer) to simplify the cataloguing, and with a central page it shouldn't
> be that difficult to know what is being developed and at what pace, or even
> just what is available. It might reduce the chances of duplicated code, help
> future developers by providing some examples, and generally be a good PR
> system for scikits/scipy packages... And yes, why not using some kind of
> graphical browser ?

I think you vastly underestimate the size of this task. It has not
entirely happened yet for Python itself, BTW. Distribution problems
are very difficult, challenging problems. And there is no silver
bullet: it needs a lot of manpower, with a lot of not-that-rewarding
tasks.

cheers,

David

