[SciPy-Dev] Proposal for Scikit-Signal - a SciPy toolbox for signal processing

Gael Varoquaux gael.varoquaux@normalesup....
Tue Jan 3 15:30:24 CST 2012


On Tue, Jan 03, 2012 at 09:37:10PM +0100, Ralf Gommers wrote:
> Integrating code into scipy after initially developing it as a separate
> package is something that is not really happening right now though.

I would like to respectfully disagree :). With regard to large
contributions, Jake VanderPlas's work on arpack started in
scikit-learn. The discussion that we had recently on integrating the
graph algorithms shows that such integration will continue. In
addition, if I look at the commits in scipy, I see plenty that were
initiated in scikit-learn (I see them because I look at the
contributions of scikit-learn developers).

That said, I know what you mean: a lot of worthwhile code is just
developed on its own and never gets merged into a major package. It's a
pity, as it would be more useful there. It is also easy to see why
it doesn't happen: the authors implemented that code to scratch an itch,
and once that itch is scratched, they are done.

> Example 1: numerical differentiation. Algopy and numdifftools are two
> mature packages that are general enough that it would make sense to
> integrate them. Especially algopy has quite good docs. Not much active
> development, and the respective authors would be in favor, see
> http://projects.scipy.org/scipy/ticket/1510.

OK, this sounds like an interesting project that could/should get
funding. Time to make a list for next year's GSoC, if we can find
somebody willing to mentor it.

> Example 2: pywavelets. Nice complete package with good docs, much better
> than scipy.signal.wavelets. Very little development activity for the
> package, and wavelets are of interest for a wide variety of applications.

Yes, pywavelets is high on my list of code that should live in a bigger
package. I find that it's actually fairly technical code, and I would be
wary of merging it in if there is not somebody with good expertise to
maintain it.

[snip (reordered quoting of Ralf's email)]

> In cases like scikits.image/learn/statsmodels, which are active,
> growing projects, that of course doesn't make sense

Well, actually, if people think that some of the algorithms that we have
in scikit-learn should be merged back in scipy, we are open to it. A few
things to keep in mind:

- We have gathered significant experience with some techniques related
  to stochastic algorithms and big data. I wouldn't like to merge code
  that is too technical into scipy, for fear of it 'dying' there. Some
  people say that code goes to the Python standard library to die [1] :).

- For the reasons explained in my previous mail (i.e. pros of having
  domain-specific packages when it comes to highly specialized features)
  I don't think it is desirable, in the long run, to see the full
  codebase of scikit-learn merged into scipy.


> Scipy is getting released more frequently now than before, and I hope
> we can keep it that way.

This, plus the move to github, does make it much easier to contribute. I
think that it is having a noticeable impact.

> or should just go and ask developers how they would feel about
> incorporating their mature code?

That might actually be useful.

Gael

[1]
http://frompythonimportpodcast.com/episode-004-dave-hates-decorators-where-code-goes-to-die

