[SciPy-user] future-safe saving of numpy arrays?

David M. Cooke cookedm at physics.mcmaster.ca
Thu Jun 8 15:53:51 CDT 2006


On Thu, 08 Jun 2006 13:50:21 -0600
Travis Oliphant <oliphant at ee.byu.edu> wrote:

> Cory Davis wrote:
> 
> > Hi All,
> >
> > I have had some trouble with my data and changes to numpy over time.  
> > I often want to save both single arrays and complicated objects with 
> > arrays as data members.  Until now I have almost always used cPickle. 
> > But this can cause problems when I upgrade numpy/scipy, when I can no 
> > longer unpickle data saved using older versions.  
> 
> Unfortunately, there were some bugs in the NumPy reduce implementation 
> that required small changes.   Post 1.0, there will not be significant 
> changes made to pickle that cause old pickles not to load (I don't see 
> any changes happening from now on, frankly).  This is definitely a 
> growing pain of the pre-1.0 release.
> 
> > Does anyone have any suggestions on avoiding this problem?
> 
> A general problem with pickle is that if you pickle objects requiring 
> specific modules, any name changes in those modules will cause 
> difficulties with loading. Most of these problems can be worked around, 
> often trivially, but it does get to be a pain for long-term 
> persistence with pickle.   Using PyTables is probably a better idea.

Looking at the pickle output (pickletools.dis is good for this), there are three
names it needs:
- the unpickle function: numpy.core._internal._reconstruct
- numpy.ndarray
- numpy.dtype

The last two names are unlikely to change (although their representations
may). As long as we as developers are careful, the first one is ok too.
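
For anyone who wants to check which names a given pickle depends on, here is
a minimal sketch using only the standard library. The helper name
global_names is made up for illustration, and a Fraction stands in for an
array so the example runs without numpy; passing the bytes of a pickled
array instead should list whatever names your numpy version records.

```python
import pickle
import pickletools
from fractions import Fraction


def global_names(data):
    """Return the sorted 'module.name' globals referenced by a protocol <= 3 pickle."""
    names = set()
    for opcode, arg, _pos in pickletools.genops(data):
        # GLOBAL carries its argument as "module name", separated by a space.
        if opcode.name == "GLOBAL":
            module, name = arg.split(" ", 1)
            names.add(module + "." + name)
    return sorted(names)


# Protocol 2 is used so that imports are recorded with the GLOBAL opcode;
# protocol 4 and later use STACK_GLOBAL, which this sketch does not decode.
payload = pickle.dumps(Fraction(1, 2), protocol=2)
print(global_names(payload))
```

Any name in that list that gets moved or renamed in a later release is a
name that old pickles will fail to find on load.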

I would suggest adding a version number to the ndarray and dtype pickles, so
that if we need to change the format post-1.0, we could still handle the old
ones (or at least warn about them). [and also to the scalar types.] Looks
like this can be added to the current code, while still being able to read
current pickles. I'll add it sometime soon.
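
To illustrate the idea (this is not the actual numpy code, just a sketch on a
toy class), a version number can be put at the front of the pickled state
while still accepting the old unversioned layout. The class Record, the
constant PICKLE_VERSION, and the (shape, data) state layout are all
hypothetical names chosen for the example.

```python
import pickle

PICKLE_VERSION = 1  # hypothetical: bump whenever the state layout changes


class Record:
    """Toy stand-in for an array-like object with versioned pickle state."""

    def __init__(self, shape, data):
        self.shape = tuple(shape)
        self.data = list(data)

    def __getstate__(self):
        # New layout: the version number comes first.
        return (PICKLE_VERSION, self.shape, self.data)

    def __setstate__(self, state):
        if isinstance(state[0], int):
            version, *fields = state
        else:
            # Legacy layout (shape, data): no version number was stored.
            version, fields = 0, list(state)
        if version > PICKLE_VERSION:
            raise ValueError("pickle written by a newer version: %d" % version)
        self.shape, self.data = fields


def roundtrip(obj):
    """Pickle and unpickle obj, exercising __getstate__ and __setstate__."""
    return pickle.loads(pickle.dumps(obj, protocol=2))
```

The check on the first element works here only because the legacy state
starts with a tuple rather than an int; a real format would need a similarly
unambiguous way to tell old state from new.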

-- 
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke                      http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca
