[IPython-user] Trivial Parallelization (Redux)

Brian Granger ellisonbg.net@gmail....
Fri Dec 12 12:34:13 CST 2008


It sounds like you are progressing.  I want to help you figure out the
other issues you are having.  Could you file a separate bug report for
each issue on the IPython Launchpad bug tracker:

https://bugs.launchpad.net/ipython

On Thu, Dec 11, 2008 at 10:43 PM, Frank Horowitz
<Frank.Horowitz@csiro.au> wrote:
> Hi Fernando (et al.),
>
> Thanks for the examples!
>
> I've managed to get things to the point where I have executed the
> equivalent of the mec.execute() call on the engines for my case. There
> have been a couple of "gotchas", however.
>
> The first "gotcha" is that triple-quoted (multi-line) strings seem to
> be failing for some reason in my environment (replicated on both
> Mac OS X and Linux, BTW). I needed to execute the initialization code
> inside a single-quoted string, with semicolons separating the
> statements. No big deal.
>
> The second "gotcha" is that I was invoking the two engines (on my
> Core2 Duo boxes) via a command line of "ipcluster -n 2". It turns out
> that this sets the working directory for each engine to whatever
> directory I happened to be in when I invoked that command line.
> Once I figured this out
> ( print mec.execute("import os; print os.getcwd()") is your friend! ),
> setting the path correctly in those initializations was
> straightforward. A PYTHONPATH environment variable would probably help
> here if it is propagated to the engines, but I haven't tested that.
>
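Both workarounds above can be combined in a single initialization
string. A minimal sketch, in Python 3 syntax (the thread itself uses
Python 2) and runnable without a cluster: build_init_code and the module
name fancymodule_demo are hypothetical, and a plain dict filled by exec
stands in for one engine's namespace. Whether mec.execute() handles the
string in exactly the same way is an assumption, not something tested
here.

```python
import os
import tempfile


def build_init_code(module_dir, statements):
    """Join statements into one semicolon-separated line.

    The leading statements put module_dir on sys.path, so imports do not
    depend on the directory the engines were started from.
    """
    setup = ["import sys", "sys.path.insert(0, %r)" % module_dir]
    return "; ".join(setup + list(statements))


# Hypothetical project directory holding a module the engines must import.
project_dir = tempfile.mkdtemp()
with open(os.path.join(project_dir, "fancymodule_demo.py"), "w") as fh:
    fh.write("def chain_convert(s):\n    return s.upper()\n")

init_code = build_init_code(
    project_dir,
    ["import os", "cwd = os.getcwd()", "import fancymodule_demo"],
)

# A plain dict stands in for one engine's namespace.
engine_ns = {}
exec(init_code, engine_ns)

print(init_code)
print(engine_ns["cwd"])
print(engine_ns["fancymodule_demo"].chain_convert("abc"))
```

Only simple statements (imports, assignments, calls) can be joined with
semicolons; a class or def body still needs real newlines.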
> I'm now stuck at the equivalent of your tc.map() call. My code snippet
> looks like:
>
> tc.map(lambda x: fancyobject.chainConvert(x), gcds)
>
> At the time that snippet is executed, the engines hold initialized
> instances of fancyobject, which has a method chainConvert(), and gcds
> is a list of strings.
>
> I get a TypeError with the message:
>
> TypeError: 'str' object does not support item assignment
>
> I'm guessing that the problem is that gcds is a sequence of strings
> and that strings are illegal in this API. If this is the case, are
> there any simple workarounds? If not, any suggestions?
>
> Thanks again for your help!
>        Frank Horowitz
>
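That message is the one Python itself raises when code assigns into a
string by index, so it may be coming from inside the call rather than
from a rule about strings in the map API; that is a guess, not a
diagnosis. A minimal sketch of the error in Python 3 syntax, where
chain_convert_demo is a hypothetical stand-in and says nothing about
what chainConvert() really does:

```python
def chain_convert_demo(text):
    """Hypothetical stand-in for a method that edits its argument in place."""
    text[0] = "X"  # legal for a list, a TypeError for a str
    return text


# Strings are immutable, so item assignment fails.
message = None
try:
    chain_convert_demo("abc")
except TypeError as err:
    message = str(err)
    print("TypeError:", message)

# The same call succeeds on a mutable sequence such as a list of characters.
converted = chain_convert_demo(list("abc"))
print("".join(converted))
```

Running one element of gcds through the method directly on an engine,
outside tc.map(), should show which side raises the error.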
>
> On 11/12/2008, at 5:35 PM, Fernando Perez wrote:
>
>> Hi Frank,
>>
>> On Wed, Dec 10, 2008 at 11:45 PM, Frank Horowitz
>> <Frank.Horowitz@csiro.au> wrote:
>>> Hi All,
>>>
>>> I'm having a little trouble getting past the "first usage" hurdle of
>>> IPython parallelization. (This is likely a FAQ.) It's closely related
>>> to Jose Gomez-Dans' thread from earlier this month, but with an added
>>> wrinkle.
>>>
>>> I'm trying to use TaskClient.map() to do its job, but the added
>>> wrinkle beyond Jose's case is that my "function" is actually a bound
>>> method of an object rather than a pure function.  The TC code throws
>>> a TypeError, complaining that a task function must be a FunctionType.
>>> (Quite true, a bound method is not a pure FunctionType...)
>>>
>>> Is there any easy workaround, perhaps via defining a doIt() type
>>> function that was mentioned in that previous thread? If so, I've not
>>> been able to discover it...
>>>
>>> TIA for any help you might be able to provide!
>>
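The type check described above can be reproduced without a cluster. A
minimal sketch in Python 3 syntax; the class Converter is hypothetical,
and the isinstance test only approximates whatever check TaskClient
actually performs:

```python
import types


class Converter(object):
    """Hypothetical class standing in for the object whose method is mapped."""

    def doit(self, x):
        return x ** 2


obj = Converter()

# A bound method has its own type, so a FunctionType check rejects it.
is_plain_function = isinstance(obj.doit, types.FunctionType)
is_bound_method = isinstance(obj.doit, types.MethodType)
print(is_plain_function, is_bound_method)

# Wrapping the call in a lambda (or a def) yields a plain function.
wrapper = lambda x: obj.doit(x)
wrapper_is_function = isinstance(wrapper, types.FunctionType)
squares = [wrapper(x) for x in range(4)]
print(wrapper_is_function, squares)
```

The wrapper refers to obj by name only, so for remote use that name has
to exist wherever the function ends up running.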
>> I think we should work on making this API somewhat cleaner, but
>> here's a first stab at a simple solution.  The key is to understand
>> how the mechanism works: for your engines to execute a call, they need
>> to be able to have in memory an instance of the object they will call,
>> whether it's a function or an instance method.  The map methods know
>> how to serialize 'on the go' a pure function, but they currently don't
>> do the same for class instances.  While we could add that, there are
>> issues with unserializing the instance at the other end, state
>> handling, etc.
>>
>> So the following example takes a slightly more verbose, but perhaps
>> safer, approach: split the problem into two files, put the classes
>> you want in one module that all engines can import, and then create
>> the callable objects directly on the engines.  Once they have been
>> created, you can use a simple lambda to wrap the call you need.
>> The code is easy:
>>
>> In [18]: cat simpleclass.py
>> class F(object):
>>    def doit(self,x):
>>        return x**2
>>
>> In [19]: cat simpletask.py
>> from IPython.kernel import client
>>
>> # Get two handles on the same group of engines
>> mec = client.MultiEngineClient()
>> tc = client.TaskClient()
>>
>> # Use the direct execution one to 'prime' the engines with the objects
>> # we need
>> mec.execute("""
>> from simpleclass import F
>> fancyobject = F()
>> """)
>>
>> # Now, use the load-balanced tc to scatter calls to .doit() over a
>> # range.  Note that this lambda refers to a *remote* name,
>> # 'fancyobject', that we created on the engines above.  This lambda is
>> # unpacked remotely and executed on the engines.
>> print 'Remote execution:',tc.map(lambda x: fancyobject.doit(x),
>>                                 range(10))
>>
>> In [20]: run simpletask.py
>> Remote execution: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
>>
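The lambda above can use a name that exists only on the engines because
a function looks up global names when it is called, in whatever
namespace it is attached to. A minimal local sketch in Python 3 syntax:
a plain dict stands in for an engine, and types.FunctionType rebuilds
the lambda with that dict as its globals. How IPython actually ships the
function to the engines is not shown here and may differ.

```python
import types

# Prime a stand-in "engine" namespace, in the spirit of mec.execute().
engine_ns = {}
exec(
    "class F(object):\n"
    "    def doit(self, x):\n"
    "        return x ** 2\n"
    "fancyobject = F()\n",
    engine_ns,
)

# 'fancyobject' is not defined locally; the lambda only names it.
task = lambda x: fancyobject.doit(x)

# Rebuild the function from its code object with the engine namespace as
# its globals, so the name resolves there at call time.
remote_task = types.FunctionType(task.__code__, engine_ns)

results = [remote_task(x) for x in range(10)]
print("Simulated remote execution:", results)
```

The rebuilt function finds fancyobject in engine_ns, which mirrors why
the engines must be primed before the map call is issued.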
>>
>>
>> Let us know if this helps, and then we'll add this to the docs if you
>> find it useful.
>>
>> Cheers,
>>
>> f
>

