[SciPy-User] Distributed computing: running embarrassingly parallel (python/c++) codes over a cluster

Luis Pedro Coelho lpc@cmu....
Thu Nov 12 08:37:36 CST 2009


Rohit Garg wrote:
> I have an embarrassingly parallel problem, very nicely suited to
> parallelization. 

I have lots of those :)

> My only constraint is that it should be able to run a python extension
> (c++) with minimum of fuss. I want to minimize the headaches involved
> with setting up/writing the boilerplate code. Which
> framework/approach/library would you recommend?

My own: It's called jug. See

http://luispedro.org/software/jug

(Or download the code from GitHub: http://github.com/luispedro/jug)


It works with any set of processors that either share a filesystem (it plays 
well with NFS, but that can be slow) or can connect to a redis database (which 
is very easy to set up and is probably as fast as any other approach if 
everything runs on the same machine).
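
To illustrate why a shared filesystem is enough for coordination, here is a minimal sketch of one common technique: claiming a task by atomically creating a lock file. This is an assumption made for illustration, not a description of how jug does it; the helper try_claim and the ".lock" naming are hypothetical.

```python
import os
import tempfile

def try_claim(lock_dir, task_name):
    """Return True if this worker claimed the task, False if it was already claimed."""
    path = os.path.join(lock_dir, task_name + ".lock")
    try:
        # O_CREAT | O_EXCL makes creation fail if the file already exists,
        # so only one caller can succeed for a given task name.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True

# Hypothetical usage: two workers race for the same task.
lock_dir = tempfile.mkdtemp()
first = try_claim(lock_dir, "compute-42")
second = try_claim(lock_dir, "compute-42")
print(first, second)
```

Whichever worker creates the file first owns the task; the others move on to the next one.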

A major advantage is that you write mostly Python (and not something funny 
looking). For example, here's what a programme with that framework would look 
like:

from glob import glob
from jug import TaskGenerator

@TaskGenerator
def preprocess(input):
    ...

@TaskGenerator
def compute(input, param):
    ...

@TaskGenerator
def collect(inputs):
    ...

# param: whatever parameter compute() needs; define it before this loop
results = []
for input in glob('*.in'):
    intermediate = preprocess(input)
    results.append(compute(intermediate, param))
final = collect(results)

The only step that differs from the sequential (non-parallel) version is adding 
the TaskGenerator decorator, which changes a call of preprocess(input) into 
Task(preprocess, input).

Jug handles everything else.
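
The idea of turning a call into a deferred task can be sketched with a toy decorator. This is an illustration only, not jug's actual implementation: the Task class, the resolve helper, and this stand-in TaskGenerator are all simplified assumptions, with no caching, locking, or distribution.

```python
class Task:
    """Toy deferred call: remembers a function and its arguments."""
    def __init__(self, func, *args):
        self.func = func
        self.args = args

    def run(self):
        # Run any Task arguments first, then call the function itself.
        resolved = [resolve(a) for a in self.args]
        return self.func(*resolved)

def resolve(value):
    """Replace Tasks (including Tasks inside lists) by their results."""
    if isinstance(value, Task):
        return value.run()
    if isinstance(value, list):
        return [resolve(v) for v in value]
    return value

def TaskGenerator(func):
    """Toy decorator: calling the decorated function builds a Task instead of running it."""
    def make_task(*args):
        return Task(func, *args)
    return make_task

@TaskGenerator
def double(x):
    return 2 * x

@TaskGenerator
def total(values):
    return sum(values)

parts = [double(i) for i in range(4)]   # nothing is computed yet
final = total(parts)                    # still only a Task
result = final.run()                    # now the whole chain runs
print(result)
```

Because nothing runs until the tasks are executed, a framework is free to decide which process runs which task and to store results in between.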

I have been using this for almost a year now for all my research work, and it 
works very well for me.

HTH,
Luis