[SciPy-User] Numpy pickle format

David Baddeley david_baddeley@yahoo.com...
Wed Nov 24 17:22:02 CST 2010

Thanks heaps for the detailed reply! That looks like it should be enough info to 
get me started ... I know it's a bit of a niche application, but is anyone else 
out there likely to be interested in similar functionality? I just want to know 
if it's worth taking the time to think about supporting some of the additional 
aspects of the protocol (e.g. C/Fortran order) before I cobble something 
together. I wonder if one could wrap JAMA to provide some very basic array 
functionality ...


----- Original Message ----
From: Robert Kern <robert.kern@gmail.com>
To: David Baddeley <david_baddeley@yahoo.com.au>; SciPy Users List 
Sent: Thu, 25 November, 2010 11:21:36 AM
Subject: Re: [SciPy-User] Numpy pickle format

On Wed, Nov 24, 2010 at 16:00, David Baddeley
<david_baddeley@yahoo.com.au> wrote:
> I was wondering if anyone could point me to any documentation for the (binary)
> format of pickled numpy arrays.
> To put my request into context, I'm using Pyro to communicate between python and
> jython, and would like to push numpy arrays into the python end and pull something
> I can work with in jython out the other end. I was thinking of a minimal class
> wrapping the standard library's array.array and having some form of shape property
> (I can pretty much guarantee that the data going in is C-contiguous, so there
> shouldn't be any strides nastiness).
> The proper way to do this would be to convert my numpy arrays to this minimal
> wrapper before pushing them onto the wire, but I've already got a fair bit of
> python code which pushes arrays round using Pyro, which I'd prefer not to have
> to rewrite. The pickle representation of array.array is also slightly different
> (broken) between CPython and Jython, and although you can pickle and unpickle,
> you end up swapping the endianness, so to recover the data [in the Jython ->
> Python direction] you've got to create a numpy array and then a view of that
> with reversed endianness.
> What I was hoping to do instead was to construct a dummy numpy.ndarray class in
> jython which knew how to pickle/unpickle numpy arrays.
> The ultimate goal is to create a Python -> ImageJ bridge so I can push images
> from some python image processing code I've got across into ImageJ without
> having to manually save and open the files.
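For illustration, the "minimal class wrapping array.array with a shape
property" idea from the message above might look roughly like the sketch below.
It is written for CPython 3 (math.prod needs 3.8 or later) and has not been
tried on Jython. The class name ShapedArray and its byteswapped() method are
made up for this example; byteswapped() stands in for the "view with reversed
endianness" workaround using array.array.byteswap().

```python
from array import array
import math


class ShapedArray:
    """Hypothetical minimal wrapper: a flat array.array plus a shape tuple.

    Data is assumed to be C-contiguous (last index varies fastest).
    """

    def __init__(self, typecode, shape, values):
        self.shape = tuple(shape)
        self.data = array(typecode, values)
        if len(self.data) != math.prod(self.shape):
            raise ValueError("shape %r does not match %d values"
                             % (self.shape, len(self.data)))

    def __getitem__(self, index):
        if not isinstance(index, tuple):
            index = (index,)
        if len(index) != len(self.shape):
            raise IndexError("expected %d indices" % len(self.shape))
        flat = 0
        for i, n in zip(index, self.shape):
            if not 0 <= i < n:
                raise IndexError("index %r out of range" % (index,))
            flat = flat * n + i  # row-major offset
        return self.data[flat]

    def byteswapped(self):
        """Return a copy with the byte order of every item reversed."""
        swapped = array(self.data.typecode, self.data)
        swapped.byteswap()
        return ShapedArray(self.data.typecode, self.shape, swapped)


img = ShapedArray("i", (2, 3), range(6))
print(img[1, 2])
```

Swapping twice gives back the original data, so the same method can undo an
unwanted swap picked up in transit.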

|3> a = np.arange(5)

|4> a.__reduce_ex__()
(<function numpy.core.multiarray._reconstruct>,
(numpy.ndarray, (0,), 'b'),

|6> a.dtype.__reduce_ex__()
(numpy.dtype, ('i4', 0, 1), (3, '<', None, None, None, -1, -1, 0))

See the pickle documentation for how these tuples are interpreted:


|12> x = np.core.multiarray._reconstruct(np.ndarray, (0,), 'b')

|13> x
array([], dtype=int8)

|14> x.__setstate__(Out[11][2])

|15> x
array([0, 1, 2, 3, 4])

|16> x.__setstate__?
Type:           builtin_function_or_method
Base Class:     <type 'builtin_function_or_method'>
String Form:    <built-in method __setstate__ of numpy.ndarray object at 0x387df40>
Namespace:      Interactive
    a.__setstate__(version, shape, dtype, isfortran, rawdata)

    For unpickling.

    version : int
        optional pickle version. If omitted defaults to 0.
    shape : tuple
    dtype : data-type
    isFortran : bool
    rawdata : string or list
        a binary string with the data (or a list if 'a' is an object array)

In order to get pickle to work, you need to stub out the types
numpy.dtype and numpy.ndarray, and the function
numpy.core.multiarray._reconstruct(). You need numpy.dtype and
numpy.ndarray to define appropriate __setstate__ methods.
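A rough sketch of that stubbing, written for CPython 3 and untested on Jython.
All names here (StubDtype, StubNDArray, stub_reconstruct, STUBS, StubUnpickler,
load_stub_array, fake_array_pickle) are made up for illustration. Instead of
installing a fake numpy module, the sketch redirects lookups through
pickle.Unpickler.find_class. fake_array_pickle() hand-assembles a protocol-0
stream shaped like the tuples shown earlier; it is not bytes captured from
NumPy, and other NumPy versions may pickle under different module paths.

```python
import io
import pickle
import struct


class StubDtype:
    """Stand-in for numpy.dtype: remembers only type string and byte order."""

    def __init__(self, typestr, align=0, copy=1):
        self.typestr = typestr  # e.g. 'i4'
        self.byteorder = None

    def __setstate__(self, state):
        # Second element of the dtype state tuple is the byte-order character.
        self.byteorder = state[1]


class StubNDArray:
    """Stand-in for numpy.ndarray: keeps shape, dtype, order flag, raw bytes."""

    def __setstate__(self, state):
        version, shape, dtype, is_fortran, rawdata = state
        self.shape = shape
        self.dtype = dtype
        self.is_fortran = is_fortran
        # Old text pickles deliver the buffer as str; latin-1 maps it back 1:1.
        if isinstance(rawdata, str):
            rawdata = rawdata.encode("latin-1")
        self.rawdata = rawdata


def stub_reconstruct(cls, shape, typechar):
    """Stand-in for numpy.core.multiarray._reconstruct: an empty stub object."""
    return cls()


STUBS = {
    ("numpy", "ndarray"): StubNDArray,
    ("numpy", "dtype"): StubDtype,
    ("numpy.core.multiarray", "_reconstruct"): stub_reconstruct,
}


class StubUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        try:
            return STUBS[(module, name)]
        except KeyError:
            raise pickle.UnpicklingError(
                "unexpected global %s.%s" % (module, name))


def load_stub_array(data):
    return StubUnpickler(io.BytesIO(data), encoding="latin-1").load()


def fake_array_pickle():
    """Hand-built protocol-0 stream for int32 [0..4]; an assumed stand-in."""
    raw = struct.pack("<5i", 0, 1, 2, 3, 4)
    escaped = "".join("\\x%02x" % b for b in raw).encode("ascii")
    return b"".join([
        b"cnumpy.core.multiarray\n_reconstruct\n",  # callable
        b"(cnumpy\nndarray\n(I0\ntS'b'\ntR",         # args, then REDUCE
        b"(I1\n(I5\nt",                              # version, shape
        b"cnumpy\ndtype\n(S'i4'\nI0\nI1\ntR",        # dtype('i4', 0, 1)
        b"(I3\nS'<'\nNNNI-1\nI-1\nI0\ntb",           # dtype state, BUILD
        b"I00\n",                                    # isfortran = False
        b"S'" + escaped + b"'\n",                    # raw data
        b"tb.",                                      # array state, BUILD, STOP
    ])


arr = load_stub_array(fake_array_pickle())
print(arr.shape, arr.dtype.typestr, arr.dtype.byteorder, len(arr.rawdata))
```

Refusing any global that is not in the STUBS table keeps the unpickler from
importing arbitrary code named in the stream.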

Check the functions arraydescr_reduce() and arraydescr_setstate() in
numpy/core/src/multiarray/descriptor.c for how to interpret the state
tuple for dtypes. If you're just dealing with straightforward image
types, then you really only need to pay attention to the first element
(the data kind and width, 'i4') in the argument tuple and the second
element (byte order character, '<') in the state tuple.
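As a hedged illustration of using just those two pieces, the sketch below turns
a type string such as 'i4' plus a byte-order character into a struct format and
decodes the raw buffer into a flat list. The function name decode_raw and the
_STRUCT_CODES table are made up for this example, the table covers only a few
common image types, and treating '|' (byte order not applicable) as native order
is an assumption that only matters for single-byte types.

```python
import struct

# Assumed mapping from kind+width strings to struct format characters.
_STRUCT_CODES = {
    "u1": "B", "i1": "b",
    "u2": "H", "i2": "h",
    "u4": "I", "i4": "i",
    "f4": "f", "f8": "d",
}


def decode_raw(typestr, byteorder, shape, rawdata):
    """Decode raw array bytes into a flat list of Python numbers."""
    code = _STRUCT_CODES[typestr]
    # struct understands '<' and '>'; anything else falls back to '='.
    prefix = byteorder if byteorder in ("<", ">") else "="
    count = 1
    for n in shape:
        count *= n
    return list(struct.unpack("%s%d%s" % (prefix, count, code), rawdata))


raw = struct.pack("<5i", 0, 1, 2, 3, 4)
print(decode_raw("i4", "<", (5,), raw))
```

The flat list is in storage order, so for C-contiguous data the shape tuple is
all that is needed to index it.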

Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

