[SciPy-User] Numpy pickle format
Robert Kern
robert.kern@gmail....
Wed Nov 24 16:21:36 CST 2010
On Wed, Nov 24, 2010 at 16:00, David Baddeley
<david_baddeley@yahoo.com.au> wrote:
> I was wondering if anyone could point me to any documentation for the (binary)
> format of pickled numpy arrays.
>
> To put my request into context, I'm using Pyro to communicate between Python and
> Jython, and would like to push numpy arrays into the Python end and pull something
> I can work with in Jython out the other end (I was thinking of a minimal class
> wrapping the standard library's array.array and having some form of shape
> property; I can pretty much guarantee that the data going in is C-contiguous,
> so there shouldn't be any strides nastiness).
>
> The proper way to do this would be to convert my numpy arrays to this minimal
> wrapper before pushing them onto the wire, but I've already got a fair bit of
> Python code which pushes arrays around using Pyro, which I'd prefer not to have
> to rewrite. The pickle representation of array.array is also slightly different
> (broken) between CPython and Jython, and although you can pickle and unpickle,
> you end up swapping the endianness, so to recover the data [in the Jython ->
> Python direction] you've got to create a numpy array and then a view of that
> with reversed endianness.
>
> What I was hoping to do instead was to construct a dummy numpy.ndarray class in
> Jython which knows how to pickle/unpickle numpy arrays.
>
> The ultimate goal is to create a Python -> ImageJ bridge so I can push images
> from some Python image processing code I've got across into ImageJ without
> having to manually save and open the files.
[~]
|3> a = np.arange(5)
[~]
|4> a.__reduce_ex__()
(<function numpy.core.multiarray._reconstruct>,
 (numpy.ndarray, (0,), 'b'),
 (1,
  (5,),
  dtype('int32'),
  False,
  '\x00\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\x04\x00\x00\x00'))
[~]
|6> a.dtype.__reduce_ex__()
(numpy.dtype, ('i4', 0, 1), (3, '<', None, None, None, -1, -1, 0))
See the pickle documentation for how these tuples are interpreted:
http://docs.python.org/library/pickle#object.__reduce__
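As a rough illustration of what the unpickler does with such a tuple, here is a minimal sketch. The names rebuild() and Image are hypothetical and only serve the example; the real logic lives inside the pickle module.

```python
def rebuild(reduced):
    """Rebuild an object from a (callable, args[, state]) reduce tuple."""
    func, args = reduced[0], reduced[1]
    state = reduced[2] if len(reduced) > 2 else None
    obj = func(*args)                   # step 1: create an "empty" object
    if state is not None:
        if hasattr(obj, '__setstate__'):
            obj.__setstate__(state)     # step 2: the object restores itself
        else:
            obj.__dict__.update(state)  # fallback when the state is a dict
    return obj


class Image(object):
    """Hypothetical array-like class, used only to exercise rebuild()."""

    def __init__(self):
        self.shape = ()
        self.data = b''

    def __reduce__(self):
        # Same layout as above: callable, its arguments, then the state.
        return (Image, (), (self.shape, self.data))

    def __setstate__(self, state):
        self.shape, self.data = state


img = Image()
img.__setstate__(((2, 2), b'\x00\x01\x02\x03'))
copy = rebuild(img.__reduce__())
```

For an ndarray the callable is _reconstruct, the arguments are (numpy.ndarray, (0,), 'b'), and the state is the five-element tuple shown in the output above.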
[~]
|12> x = np.core.multiarray._reconstruct(np.ndarray, (0,), 'b')
[~]
|13> x
array([], dtype=int8)
[~]
|14> x.__setstate__(Out[4][2])
[~]
|15> x
array([0, 1, 2, 3, 4])
[~]
|16> x.__setstate__?
Type: builtin_function_or_method
Base Class: <type 'builtin_function_or_method'>
String Form: <built-in method __setstate__ of numpy.ndarray object
at 0x387df40>
Namespace: Interactive
Docstring:
    a.__setstate__(version, shape, dtype, isfortran, rawdata)
    For unpickling.
    Parameters
    ----------
    version : int
        optional pickle version. If omitted defaults to 0.
    shape : tuple
    dtype : data-type
    isFortran : bool
    rawdata : string or list
        a binary string with the data (or a list if 'a' is an object array)
To get unpickling to work on the Jython side, you need to stub out the types
numpy.dtype and numpy.ndarray, and the function
numpy.core.multiarray._reconstruct(). Your stubs for numpy.dtype and
numpy.ndarray need to define appropriate __setstate__ methods.
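A minimal sketch of what such stubs might look like, in pure Python with no numpy dependency. The attribute names (typestr, byteorder, rawdata) are my own choices, and the sketch assumes C-contiguous, non-object arrays. To be picked up by pickle, these objects would also have to be importable under the names numpy.dtype, numpy.ndarray and numpy.core.multiarray._reconstruct, which is not shown here.

```python
class dtype(object):
    """Stand-in for numpy.dtype; keeps only what simple image types need."""

    def __init__(self, typestr, align=0, copy=1):
        self.typestr = typestr    # e.g. 'i4': kind 'i', 4 bytes per item
        self.byteorder = '='      # placeholder until __setstate__ runs

    def __setstate__(self, state):
        # state[1] is the byte order character, e.g. '<';
        # the other entries are ignored in this sketch.
        self.byteorder = state[1]


class ndarray(object):
    """Stand-in for numpy.ndarray; holds shape, dtype and the raw bytes."""

    def __init__(self):
        self.shape = (0,)
        self.dtype = None
        self.rawdata = b''

    def __setstate__(self, state):
        version, shape, dt, is_fortran, rawdata = state
        if is_fortran:
            raise ValueError('this sketch only handles C-contiguous data')
        self.shape, self.dtype, self.rawdata = shape, dt, rawdata


def _reconstruct(subtype, shape, typecode):
    # Only an empty object is needed here: the shape and typecode passed in
    # are placeholders that __setstate__ overwrites afterwards.
    return subtype()


# Replay the tuples from the session above by hand.
d = dtype('i4', 0, 1)
d.__setstate__((3, '<', None, None, None, -1, -1, 0))
x = _reconstruct(ndarray, (0,), 'b')
x.__setstate__((1, (5,), d, False,
                b'\x00\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00'
                b'\x03\x00\x00\x00\x04\x00\x00\x00'))
```

Note that pickle passes the whole state tuple to __setstate__ as a single argument, which is why the stubs unpack it themselves.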
Check the functions arraydescr_reduce() and arraydescr_setstate() in
numpy/core/src/multiarray/descriptor.c for how to interpret the state
tuple for dtypes. If you're just dealing with straightforward image
types, then you really only need to pay attention to the first element
(the data kind and width, 'i4') in the argument tuple and the second
element (byte order character, '<') in the state tuple.
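As a sketch of how those two values could be used to decode the raw buffer with only the standard library: the function name decode() and the table of supported types are assumptions for illustration, covering a handful of common image types rather than everything numpy can produce.

```python
import struct

# struct format characters for some common image types, keyed by
# (kind, width in bytes) as found in type strings such as 'i4' or 'u2'.
_STRUCT_CODES = {
    ('u', 1): 'B', ('i', 1): 'b',
    ('u', 2): 'H', ('i', 2): 'h',
    ('u', 4): 'I', ('i', 4): 'i',
    ('f', 4): 'f', ('f', 8): 'd',
}


def decode(typestr, byteorder, rawdata):
    """Turn raw array bytes into a flat list of Python numbers."""
    kind, width = typestr[0], int(typestr[1:])
    code = _STRUCT_CODES[(kind, width)]
    if byteorder in ('|', '='):
        byteorder = '='   # not applicable or native: use native order
    count = len(rawdata) // width
    return list(struct.unpack(byteorder + str(count) + code, rawdata))


raw = (b'\x00\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00'
       b'\x03\x00\x00\x00\x04\x00\x00\x00')
values = decode('i4', '<', raw)   # the buffer from the arange(5) example
```

Because the byte order is taken from the pickled state rather than from the receiving machine, this approach should not need the byte-swapping workaround described for array.array.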
--
Robert Kern
"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco