[IPython-User] IPython parallel and namedtuple

MinRK benjaminrk@gmail....
Mon Feb 18 14:28:51 CST 2013


On Mon, Feb 18, 2013 at 2:47 AM, John Reid <j.reid@mail.cryst.bbk.ac.uk> wrote:

> Hi,
>
> I think the parallel functionality in ipython used to work with
> collections.namedtuples but does not with version 0.12.1.
>
> For example if I have module parallel_helper.py:
>
> from collections import namedtuple
> Result = namedtuple('Result', 'pid duration')
>
> and a script:
>
> from IPython import parallel
>
> def sleep(secs):
>     import os, time, parallel_helper
>     start = time.time()
>     time.sleep(secs)
>     return parallel_helper.Result(os.getpid(), time.time() - start)
>
> rc = parallel.Client()
> v = rc.load_balanced_view()
> async_result = v.map_async(sleep, range(3, 0, -1), ordered=False)
> for ar in async_result:
>     print ar
>
> then an error is raised when the results are passed back:
>
> RemoteError: TypeError(__new__() takes exactly 3 arguments (2 given))
> Traceback (most recent call last):
>   File
> "/usr/lib/python2.7/dist-packages/IPython/parallel/engine/streamkernel.py",
> line 345, in apply_request
>     packed_result,buf = serialize_object(result)
>   File "/usr/lib/python2.7/dist-packages/IPython/parallel/util.py", line
> 230, in serialize_object
>     clist = canSequence(obj)
>   File "/usr/lib/python2.7/dist-packages/IPython/utils/pickleutil.py",
> line 121, in canSequence
>     return t([can(i) for i in obj])
>   File "/usr/lib/python2.7/dist-packages/IPython/utils/pickleutil.py",
> line 105, in can
>     return canSequence(obj)
>   File "/usr/lib/python2.7/dist-packages/IPython/utils/pickleutil.py",
> line 121, in canSequence
>     return t([can(i) for i in obj])
> TypeError: __new__() takes exactly 3 arguments (2 given)
>
> I think the problem is somewhere in the code that treats lists and
> tuples as special cases in pickleutil.py:
>
> def canSequence(obj):
>     if isinstance(obj, (list, tuple)):
>         t = type(obj)
>         return t([can(i) for i in obj])
>     else:
>         return obj
>
> As far as I can remember namedtuples that were well defined in their own
> modules were perfectly usable across the parallel interface.
>
> Is there any way that I can still use them? I find them extremely useful.
>

Sorry about this.  I've made a pull request
<https://github.com/ipython/ipython/pull/2951> against master
that should fix it.

A patch for stable IPython is a bit messy.  You would have to monkeypatch
IPython.zmq.serialize.serialize_object/unserialize_object to handle
namedtuples specially.
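
As a rough illustration of what "handle namedtuples specially" could mean,
here is a minimal standalone sketch. It does not touch IPython and is not
the code from the pull request: rebuild_naive only mimics the
type(obj)(list) pattern quoted from canSequence, and is_namedtuple and
rebuild_namedtuple_aware are hypothetical helper names. The check for a
_fields attribute is a heuristic, assumed here to be good enough to tell
namedtuples from plain tuples.

```python
from collections import namedtuple

Result = namedtuple('Result', 'pid duration')


def rebuild_naive(obj):
    """Mimic the pattern in canSequence: type(obj)(list_of_items)."""
    if isinstance(obj, (list, tuple)):
        return type(obj)([item for item in obj])
    return obj


def is_namedtuple(obj):
    """Heuristic (assumption): a tuple subclass that has a _fields attribute."""
    return isinstance(obj, tuple) and hasattr(obj, '_fields')


def rebuild_namedtuple_aware(obj):
    """Rebuild sequences, unpacking the items when obj is a namedtuple."""
    if is_namedtuple(obj):
        # A namedtuple constructor takes one positional argument per field.
        return type(obj)(*[item for item in obj])
    if isinstance(obj, (list, tuple)):
        return type(obj)([item for item in obj])
    return obj


r = Result(1234, 0.5)

try:
    rebuild_naive(r)
    naive_failed = False
except TypeError:
    # Result([1234, 0.5]) passes a single argument where two are required.
    naive_failed = True

fixed = rebuild_namedtuple_aware(r)
```

The naive version fails for the same reason as the traceback above: the
namedtuple class is called with one list instead of one argument per field.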

The other (probably better) options in the meantime are:

use a trivial class instead of namedtuple:

class Result(object):
    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)

r = Result(pid=os.getpid(), duration=time.time() - start)

or do the namedtuple wrapping on the Client side:

def task():
    ...
    return os.getpid(), time.time() - start

amr = view.map_async(task)
for tup in amr:
    r = helper.Result(*tup)  # unpack: one argument per field
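
A self-contained sketch of the client-side wrapping, for anyone who wants
to try the idea without a running cluster. It assumes the results arrive as
plain tuples; a list comprehension stands in for iterating over the
map_async result, and the short sleep times are made up for the example.

```python
import os
import time
from collections import namedtuple

Result = namedtuple('Result', 'pid duration')


def task(secs):
    """Return a plain tuple instead of a namedtuple."""
    start = time.time()
    time.sleep(secs)
    return os.getpid(), time.time() - start


# Stand-in for iterating over view.map_async(task, ...): plain tuples arrive.
raw_results = [task(secs) for secs in (0.02, 0.01)]

# Wrap on the client side; _make builds a namedtuple from any iterable.
results = [Result._make(tup) for tup in raw_results]
```

Result._make(tup) and Result(*tup) are equivalent here; either avoids
passing the whole tuple as a single argument.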


Neither of these is great, but thanks for the report; the fix should be
in master shortly.

-MinRK


>
> Thanks,
> John.