[AstroPy] new pyfits version deletes NP_pyfits, breaking pickle

Mark Sienkiewicz sienkiew@stsci....
Fri Nov 12 09:27:14 CST 2010


Perry Greenfield wrote:
> Hi Joe,
> 
> We'll look into it. This is a general problem with pickles (and one  
> reason I've been hesitant to use them as save files). I  
> wonder if there is a better solution than that. In this case we had to  
> clean out the previous numarray interface.


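The pickle module is fragile because it stores the internal structure of an object, which is not part of the object's defined interface.  When a library reorganizes its internals (as with the removal of NP_pyfits), old pickles stop loading.  The flaw is inherent in the design of the Python pickle module; it is impossible to fix without changing that design.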

A more reliable pickling system defines a common format for describing objects and pairs it with a custom pickler/unpickler for each data type that might be stored.  The converter needs to know about the object being pickled, but that knowledge is not stored in the pickle file.  This method lacks some of the magic of the Python pickler, but it is also less fragile.
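As a rough sketch of that idea (not an existing library), here is a tiny converter registry.  The Point class, the "point" tag, and the dump/load helpers are all made up for illustration.  The file records only a type tag and public values; the knowledge of how to rebuild the object lives in the converter code.

```python
import json


class Point:
    """Toy class; its public interface is the x and y attributes."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


# One (class, encode, decode) entry per supported type, keyed by a stable tag.
# The tag is part of the file format; the class layout is not.
CONVERTERS = {
    "point": (
        Point,
        lambda p: {"x": p.x, "y": p.y},
        lambda d: Point(d["x"], d["y"]),
    ),
}


def dump(obj):
    """Serialize obj using the converter registered for its type."""
    for tag, (cls, encode, _decode) in CONVERTERS.items():
        if isinstance(obj, cls):
            return json.dumps({"type": tag, "data": encode(obj)})
    raise TypeError("no converter registered for %s" % type(obj).__name__)


def load(text):
    """Rebuild an object from text produced by dump()."""
    doc = json.loads(text)
    _cls, _encode, decode = CONVERTERS[doc["type"]]
    return decode(doc["data"])
```

If Point is later renamed or moved to another module, only the registry entry changes; files written earlier still load.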

One obvious solution is to use a standard format like XML or JSON.  JSON is particularly attractive if you can convert your object into a dictionary or list that contains only dictionaries, lists, strings, and numbers.  The json library can convert such a data structure to a string, or back, in a single call.
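For instance, with the standard json module the round trip looks like this.  The record contents are invented for the example.

```python
import json

# Made-up record containing only dicts, lists, strings, and numbers.
record = {
    "instrument": "ACS",
    "filters": ["F606W", "F814W"],
    "exptime": 1200.0,
}

text = json.dumps(record)      # data structure -> string, one call
restored = json.loads(text)    # string -> data structure, one call
```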

The whole point, though, is to make a pickler that knows how to convert the object without storing knowledge of the object's internal structure.  The pickle format (including the _names_ and _meanings_ of every field) must be well-defined.  If it is, you can have picklers/unpicklers that make the pickle files work transparently with different libraries.

So, for example, you could pickle a FITS header object as an ordered list of tuples, where each tuple is (name, value, comment).  With this format, it should be possible to convert a FITS header to/from JSON using only the public interfaces in PyFITS.  Public interfaces change less often than internal implementations, so you are somewhat insulated from problems like the current one.
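Here is a minimal sketch of what that format could look like.  It works on a plain list of (name, value, comment) tuples and deliberately leaves out the step of reading the cards from, or writing them back to, a PyFITS header, since that depends on the PyFITS version in use.  The function names and the "format"/"version" fields are my own assumptions, not anything PyFITS defines.

```python
import json

FORMAT_NAME = "fits-header-cards"   # hypothetical format tag
FORMAT_VERSION = 1                  # hypothetical version number


def cards_to_json(cards):
    """Serialize an ordered list of (name, value, comment) tuples."""
    doc = {
        "format": FORMAT_NAME,
        "version": FORMAT_VERSION,
        # JSON has no tuple type, so each card is written as a 3-item list.
        "cards": [[name, value, comment] for name, value, comment in cards],
    }
    return json.dumps(doc)


def cards_from_json(text):
    """Inverse of cards_to_json(); returns a list of tuples in file order."""
    doc = json.loads(text)
    if doc.get("format") != FORMAT_NAME or doc.get("version") != FORMAT_VERSION:
        raise ValueError("unrecognized header format")
    return [tuple(card) for card in doc["cards"]]


# Example cards, as they might be read from a header's public interface.
cards = [
    ("SIMPLE", True, "conforms to FITS standard"),
    ("BITPIX", 16, "bits per data value"),
    ("NAXIS", 0, "number of data axes"),
]
round_tripped = cards_from_json(cards_to_json(cards))
```

Nothing in the file names a Python class or module, so a reader written against a different library (or a later PyFITS) can still load it.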

Mark S.


