[IPython-User] IPython cluster: stopping
Fri Jan 27 13:24:43 CST 2012
I am using ipcluster (from a rather recent github master) to run some
resampling/bootstrapping analysis of a rather large MRI dataset. For now, I
am running this locally on an eight core machine (on Fedora). I start by
calling ipcluster start. Everything fires up OK ("Engines appear to have
started successfully") and things seem to be going fine.
To do the calculations I call something like the following sequence:
import IPython.parallel as p  # assumed import; not shown in the original post
rc = p.Client()
rc[:].execute('import numpy as np')
...  # A few more imports of my own analysis modules
dview = rc[:]
kappa = []
for n in [8, 16, 32, 64, 128]:
    kappa.append(calc_boot(booter, data, n, params, dview))
Where n is a resampling parameter and the function calc_boot is a wrapper
around the computation, which does some allocation of variables and
reorganization of the outputs and includes the lines:
this_kappa = np.zeros(kappa_size)
m = 0
while m < B:  # B is one of the parameters: how many boot-samples to run
    this = dview.apply_async(booter, data, n, params).get()
    this_kappa += this
    m += len(this)
Here booter is the function that does some fitting on the data and
calculates the specific variable kappa, which then gets averaged into the
return variable this_kappa. That's the lengthy computation itself on
the data. This seems to work great (and fast!) for a while. Monitoring my
system, I can see that all eight CPUs are running at full throttle. Then,
after about half an hour of running, I get a message that the IPython cluster
is stopping the engines. Once that happens, everything grinds to a halt.
I don't know if this is relevant, but I noticed that while my analysis is
running, memory usage skyrockets, even though kappa is not such a huge
variable and is only a derived measure from the data. When the IPython
cluster stops, memory usage goes back down as well.
Any ideas on how to keep my cluster going?
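In case it helps anyone reproduce the structure of the loop without a running cluster, here is a minimal sketch of the accumulate-until-B pattern described above. It is not my actual analysis code: a thread pool from the standard library stands in for the eight engines, and this booter is a hypothetical toy that returns the mean of n values resampled with replacement. The names N_WORKERS and run_demo, the seed argument, and the signature of calc_boot are assumptions made up for the illustration.

```python
import random
from concurrent.futures import ThreadPoolExecutor

N_WORKERS = 8  # stand-in for the eight local engines (assumption)


def booter(data, n, seed):
    """Hypothetical booter: mean of n values resampled with replacement."""
    rng = random.Random(seed)
    sample = [rng.choice(data) for _ in range(n)]
    return sum(sample) / n


def calc_boot(data, n, B, pool):
    """Run at least B boot-samples in rounds of N_WORKERS and average them."""
    total = 0.0
    m = 0
    while m < B:
        # One task per worker, loosely mirroring one apply_async round.
        futures = [pool.submit(booter, data, n, m + k) for k in range(N_WORKERS)]
        this = [f.result() for f in futures]  # blocks until the round is done
        total += sum(this)
        m += len(this)
    return total / m, m


def run_demo():
    """Repeat the bootstrap for each resampling parameter n."""
    data = [float(x) for x in range(100)]
    with ThreadPoolExecutor(max_workers=N_WORKERS) as pool:
        return [calc_boot(data, n, B=20, pool=pool) for n in [8, 16, 32, 64, 128]]
```

Calling run_demo() returns one (average, number of boot-samples) pair per value of n. Because each round submits one task per worker, the loop can overshoot B: it stops at the first multiple of N_WORKERS that is at least B.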