[IPython-User] IPython cluster: stopping

Ariel Rokem arokem@gmail....
Fri Jan 27 13:24:43 CST 2012


Hi everyone,

I am using ipcluster (from a rather recent GitHub master) to run some
resampling/bootstrapping analysis of a rather large MRI dataset. For now, I
am running this locally on an eight-core machine (on Fedora). I start by
calling ipcluster start. Everything fires up OK ("Engines appear to have
started successfully") and things seem to be going fine.

To do the calculations I call something like the following sequence:

from IPython import parallel as p  # assumed import; not shown in the original post
rc = p.Client()
rc[:].execute('import numpy as np')
... # A few more imports of my own analysis modules

dview = rc[:]

kappa = []

for i in n:  # n = [8, 16, 32, 64, 128]
    kappa.append(calc_boot(booter, data, i, params, dview))

where n is a resampling parameter and the function calc_boot is a wrapper
around the computation, which does some allocation of variables and
reorganization of the outputs and includes the lines:

...

this_kappa = np.zeros(kappa_size)
m = 0
while m < B:  # B is one of the parameters: how many boot-samples to run
    this = dview.apply_async(booter, data, n, params).get()
    this_kappa += this
    m += len(this)
...

this_kappa /= m
return this_kappa

Here booter is the function that does some fitting on the data and
calculates the specific variable kappa, which then gets averaged into the
return variable this_kappa. That is the lengthy computation on the data
itself. This seems to work great (and fast!) for a while. Monitoring my
system, I can see that all eight CPUs are running at full throttle. Then,
after about half an hour of running, I get a message that the IPython cluster
is stopping the engines. Once that happens, everything grinds to a halt.
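
For anyone who wants to reproduce the accumulation pattern in calc_boot
without a running cluster, a minimal serial sketch follows. It is not the
actual analysis code: fake_booter, calc_boot_serial, n_engines and seed are
hypothetical names, and the list comprehension stands in for
dview.apply_async(...).get(), assuming that call returns one result per
engine.

```python
import numpy as np


def fake_booter(data, n, params, rng):
    """Hypothetical stand-in for booter: resample n rows, return their mean."""
    idx = rng.integers(0, data.shape[0], size=n)
    return data[idx].mean(axis=0)


def calc_boot_serial(data, n, params, B, n_engines=8, seed=0):
    """Accumulate boot-samples in rounds of n_engines until at least B are done."""
    rng = np.random.default_rng(seed)
    this_kappa = np.zeros(data.shape[1])
    m = 0
    while m < B:
        # One round: one result per simulated engine (assumption: this mirrors
        # what dview.apply_async(booter, data, n, params).get() hands back).
        this = [fake_booter(data, n, params, rng) for _ in range(n_engines)]
        this_kappa += np.sum(this, axis=0)  # sum the per-engine results
        m += len(this)
    return this_kappa / m, m


data = np.arange(20.0).reshape(10, 2)
kappa, m = calc_boot_serial(data, n=8, params=None, B=20)
```

Because each round adds n_engines samples, the loop can overshoot B when B
is not a multiple of the engine count; dividing by m rather than B keeps the
average correct in that case.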

I don't know if this is relevant, but I noticed that while my analysis is
running, memory usage skyrockets, even though kappa is not such a huge
variable and is only a derived measure from the data. When the IPython
cluster stops, memory usage goes back down as well.

Any ideas on how to keep my cluster going?

Thanks!

Ariel