[IPython-User] ipcluster runs out of memory and can't purge results
Fri Dec 14 10:25:43 CST 2012
I think there are a lot of caches, and rc.purge_results('all') doesn't
quite get them all. You might try something like this
def clear_cache(rc, dview):
rc.purge_results('all') #clears controller
assert not rc.outstanding, "don't clear history when tasks are
rc.history = 
dview.history = 
Not sure if this will solve your problem, but I think I've run into a
similar problem, and I've had some success with this.
On Fri, Dec 14, 2012 at 7:05 AM, Johann Rohwer <email@example.com> wrote:
> To follow up on my own post, I have run the script below for a single
> chunk (600 parameter sets). The ipython process then consumes about
> 2.5Gb of memory. I have then deleted the rc, lv, arl, ds and hdf
> objects in the ipython console where I ran the script. I have also
> imported gc and ran gc.collect(). However, the ipython process still
> uses 2.5Gb of memory. Only when I quit the ipython session is the
> memory freed.
> (Note that the ipcontroller and ipengine processes are behaving fine.)
> I am really puzzled by this...
> On Friday 14 December 2012 10:49:16 Johann Rohwer wrote:
> > I'm running a very large computation on a 110-node ipcluster with
> > shared home directory and where the enginges are launched with ssh.
> > Basically it's a total of 2349 runs which each produce a (512,50,23)
> > dataset, so the total resulting data will be on the order of 6Gb.
> > The problem I'm experiencing is that the ipython process that runs
> > the client just keeps increasing in memory and in the end kills the
> > machine (the machine running the controller as well as the client
> > has about 5Gb RAM including swap).
> > To get around this I've chuncked the computation so that the memory
> > copes with one of the chunks. The problem is that the memory does
> > not appear to be freed when rc.purge_results('all') is called (see
> > below) and just keeps increasing in size. I'm using HDF5 for data
> > storage. I've also tried playing around with simple jobs just
> > calling one function and then running
> > rc.purge_results('msg_id_of_async_result'), but the result is still
> > listed in rc.results and does not seem to be cleared. What am I
> > doing wrong? Should I perhaps be deleting the rc object after each
> > chunk and re-instantiating it?
> > The essential points of the code producing the problem are included
> > below.
> > --Johann
> > --------------------------- 8-< ------------------------------------
> > from h5py import File
> > from IPython.parallel import Client
> > def Analyze(p):
> > <some code>
> > return res # numpy array of shape (512,50,23)
> > hdf = File('mod_res.hdf','w')
> > s = (2349,512,50,23)
> > ds = hdf.require_dataset('results',shape=s,dtype='f')
> > hdf.flush()
> > rc = Client()
> > lv = rc.load_balanced_view()
> > # pall is an input array of parameters for Analyze, length 2349
> > chunk = 600 # chunk size for splitting up the
> > computation n = len(pall)/chunk + 1 # number of runs
> > for j in range(n):
> > arl =  # asynchronous results list
> > ps = pall[j*chunk:(j+1)*chunk].copy()
> > for i in range(len(ps)):
> > arl.append(lv.apply(Analyze, ps[i]))
> > lv.wait()
> > for i in range(len(arl)):
> > ar = arl[i].get()
> > ds[j*chunk+i] = ar
> > hdf.flush()
> > rc.purge_results('all')
> > hdf.close()
> E-pos vrywaringsklousule
> Hierdie e-pos mag vertroulike inligting bevat en mag regtens
> geprivilegeerd wees en is slegs bedoel vir die persoon aan wie dit
> geadresseer is. Indien u nie die bedoelde ontvanger is nie, word u hiermee
> in kennis gestel dat u hierdie dokument geensins mag gebruik, versprei of
> kopieer nie. Stel ook asseblief die sender onmiddellik per telefoon in
> kennis en vee die e-pos uit. Die Universiteit aanvaar nie aanspreeklikheid
> vir enige skade, verlies of uitgawe wat voortspruit uit hierdie e-pos en/of
> die oopmaak van enige lês aangeheg by hierdie e-pos nie.
> E-mail disclaimer
> This e-mail may contain confidential information and may be legally
> privileged and is intended only for the person to whom it is addressed. If
> you are not the intended recipient, you are notified that you may not use,
> distribute or copy this document in any manner whatsoever. Kindly also
> notify the sender immediately by telephone, and delete the e-mail. The
> University does not accept liability for any damage, loss or expense
> arising from this e-mail and/or accessing any files attached to this e-mail.
> IPython-User mailing list
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the IPython-User