[SciPy-dev] Dataset for examples and license

Anne Archibald peridot.faceted@gmail....
Tue Apr 24 00:53:45 CDT 2007

On 24/04/07, David Cournapeau <david@ar.media.kyoto-u.ac.jp> wrote:
> Hi,
>     I would like to know what should be done when including a dataset
> in scipy? For example, during the development of my project pymachine,
> I would like to include some famous data like iris/old faithful data,
> etc... for demo of classic machine learning algorithms. R has some
> interesting data, but is licensed under the GPL, and I am not quite
> sure what the status of the data is wrt the license. Does GPL also
> cover raw data?

Not necessarily appropriate for machine learning, and this doesn't
answer your question, but there's lots of astronomy data which is
public (and in fact I think in the public domain as it's a NASA

For inclusion in scipy, supposing the license is fine, if the data is
small (a few kilobytes?) it can go in a test case. (Does scipy *have*
a collection of example code in the distribution? It would be nice...)
If it's bigger (a few megabytes?) it could go on the Wiki; if it's
really big it could probably go on the Wikimedia Commons (though do
they support arbitrary file types?).

Uh, I should say, I'm not a scipy developer, so this is rather my best
guess at what they would permit.


