[SciPy-dev] Starting a datasets package, again

Robert Kern robert.kern@gmail....
Tue Jun 5 17:13:07 CDT 2007

David Cournapeau wrote:
> Hi,
>     Following the recent discussion about datasets, licensing and 
> inclusion in scipy, I sent several emails to people I believe to be 
> copyright holders for some data to get their authorization. As I am 
> receiving answers, I would like to start a package for datasets in scipy 
> or scikits. Robert proposed a convention for such packages a few weeks 
> ago: 
> http://projects.scipy.org/pipermail/scipy-dev/2007-April/006981.html. 
> Basically, there would be a package scipydata with subpackages, one per 
> dataset (à la scikits, if I understand correctly). When time allows, some 
> utilities for downloading, caching, etc. of datasets could be 
> implemented, but I guess that as long as we agree on the interface, this 
> does not need to be done now.

The iris and oldfaithful packages you posted earlier were good. We might want to
fiddle with the metadata later, but what you had is probably sufficient.
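
For readers without the earlier thread at hand, here is a minimal sketch of
what one per-dataset subpackage might look like. It is an illustration only:
the module path, the metadata names (NAME, LICENSE, SOURCE) and the load()
function are assumptions, not the convention from the linked post, and the
values are placeholders rather than real measurements.

```python
import csv
import io

# Hypothetical contents of scipydata/example/__init__.py.
# NAME, LICENSE, SOURCE and load() are assumed names for illustration.
NAME = "example"
LICENSE = "terms set by the copyright holder"  # per dataset, not assumed BSD
SOURCE = "citation requested by the copyright holder goes here"

# Placeholder values, not real data; a real package would more likely
# ship the data in a file next to this module.
_RAW = """x,y
1.0,2.0
3.0,4.0
"""


def load():
    """Return the dataset as a dict holding the data and its metadata."""
    rows = csv.DictReader(io.StringIO(_RAW))
    data = [{key: float(value) for key, value in row.items()} for row in rows]
    return {"name": NAME, "license": LICENSE, "source": SOURCE, "data": data}
```

Keeping the license and source next to the data means each dataset can carry
its own terms, and any download or caching utilities added later could sit
behind the same load() call without changing the interface.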

> Would it be OK to create such a package in the next few days with the 
> incoming data? I think that starting the actual package may encourage 
> other people to join the bandwagon. Concerning the license, if the copyright 
> holder requires being cited in the sources, is that OK? (I am a bit 
> confused because modified BSD does not require keeping the 
> acknowledgments, so I am not sure exactly how to apply it correctly in 
> this case.)

It would not be okay to put a BSD license on that data. It would be making a
false representation as to the actual terms attached to the data. But that's
fine since they won't be distributed as part of scipy proper anyways and can
have whatever license the authors deem appropriate. Personally, while I mind
distributing non-open source *code* in scikits, I don't mind distributing
non-open source but redistributable datasets.

We need to figure out a place for these, though. I'm not sure where to put them.

Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco
