[IPython-user] Trivial Parallelization (Redux)

Fernando Perez fperez.net@gmail....
Thu Dec 11 02:35:50 CST 2008


Hi Frank,

On Wed, Dec 10, 2008 at 11:45 PM, Frank Horowitz
<Frank.Horowitz@csiro.au> wrote:
> Hi All,
>
> I'm having a little trouble getting past the "first usage" hurdle of
> IPython parallelization. (This is likely a FAQ.) It's closely related
> to Jose Gomez-Dans' thread from earlier this month, but with an added
> wrinkle.
>
> I'm trying to use TaskClient.map() to do its job, but the added
> wrinkle beyond Jose's case is that my "function" is actually a bound
> method of an object rather than a pure function.  The TC code throws a
> TypeError, complaining that a task function must be a FunctionType.
> (Quite true, a bound method is not a pure FunctionType...)
>
> Is there any easy workaround, perhaps via defining a doIt() type
> function that was mentioned in that previous thread? If so, I've not
> been able to discover it...
>
> TIA for any help you might be able to provide!

I think we should work on making this API somewhat cleaner, but
here's a first stab at a simple solution.  The key is to understand
how the mechanism works: for your engines to execute a call, they need
to have in memory the object they will call, whether it's a function
or an instance method.  The map methods know how to serialize a pure
function 'on the go', but they currently don't do the same for class
instances.  While we could add that, there are issues with
unserializing the instance at the other end, state handling, etc.
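
As a quick local illustration of the distinction behind that TypeError
(no engines involved), the standard 'types' module shows that a bound
method is not a FunctionType, while a lambda wrapping it is.  The class
name 'Demo' is made up for this sketch:

```python
import types

class Demo(object):
    def doit(self, x):
        return x ** 2

d = Demo()

# A bound method is a MethodType, not a FunctionType.
bound_is_function = isinstance(d.doit, types.FunctionType)
bound_is_method = isinstance(d.doit, types.MethodType)

# A lambda that merely calls the method is a plain function.
wrapper = lambda x: d.doit(x)
wrapper_is_function = isinstance(wrapper, types.FunctionType)

print("bound method is FunctionType: %s" % bound_is_function)
print("bound method is MethodType:   %s" % bound_is_method)
print("lambda is FunctionType:       %s" % wrapper_is_function)
```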

So the following example takes a slightly more verbose, but perhaps
safer, approach: split the problem into two files, put the classes you
want in one module that all engines can import, and then create the
callable objects directly on the engines.  Once they have been
created, you can use a simple lambda to wrap the call you need.
The code is easy:

In [18]: cat simpleclass.py
class F(object):
    def doit(self,x):
        return x**2

In [19]: cat simpletask.py
from IPython.kernel import client

# Get two handles on the same group of engines
mec = client.MultiEngineClient()
tc = client.TaskClient()

# Use the direct execution one to 'prime' the engines with the objects we need
mec.execute("""
from simpleclass import F
fancyobject = F()
""")

# Now, use the load-balanced tc to scatter calls to .doit() over a range.  Note
# that this lambda refers to a *remote* name, 'fancyobject', that we created on
# the engines above.  This lambda is unpacked remotely and executed on the
# engines.
print 'Remote execution:',tc.map(lambda x: fancyobject.doit(x),
                                 range(10))

In [20]: run simpletask.py
Remote execution: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
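
To see why the lambda can refer to a name that only exists remotely,
here is a purely local sketch that fakes a single engine with a plain
dict.  This is only an analogy, not how IPython actually ships tasks:
'engine_ns' is a hypothetical stand-in for an engine's namespace, and
rebinding the lambda's code object with types.FunctionType is just one
way to mimic "run this function over there".

```python
import types

# Hypothetical stand-in for one engine's namespace (illustration only).
engine_ns = {}

# "Prime" the fake engine, playing the role of mec.execute() above.
exec("""
class F(object):
    def doit(self, x):
        return x**2
fancyobject = F()
""", engine_ns)

# The lambda only stores the *name* 'fancyobject'; nothing is looked up
# until it is called.
task = lambda x: fancyobject.doit(x)

# Called locally, the name is missing, so the call fails.
try:
    task(3)
    local_call_failed = False
except NameError:
    local_call_failed = True

# Rebind the same code object to the fake engine's globals and call it there.
remote_task = types.FunctionType(task.__code__, engine_ns)
results = [remote_task(i) for i in range(10)]

print("Local call failed: %s" % local_call_failed)
print("Simulated remote execution: %s" % results)
```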



Let us know if this helps, and then we'll add this to the docs if you
find it useful.

Cheers,

f

