[IPython-User] ipcluster runs out of memory and can't purge results

Robert Nishihara robertnishihara@gmail....
Fri Dec 14 10:25:43 CST 2012


I think there are a lot of caches, and rc.purge_results('all') doesn't
quite get them all. You might try something like this:

    def clear_cache(rc, dview):
        rc.purge_results('all')   # clear results cached on the controller (hub)
        rc.results.clear()        # clear the client's local results cache
        rc.metadata.clear()       # clear the client's metadata cache
        dview.results.clear()     # clear the view's results cache
        assert not rc.outstanding, "don't clear history when tasks are outstanding"
        rc.history = []           # forget sent message ids on the client
        dview.history = []        # forget sent message ids on the view

Not sure if this will solve your problem, but I've run into something
similar and had some success with this approach.
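
For what it's worth, the way I'd use it in a chunked loop like yours is to
call it once per chunk, right after the chunk's results have been written
out. Rough sketch only (untested), reusing rc, lv, pall, chunk, n, Analyze,
ds and hdf from your script below:

    for j in range(n):
        ps = pall[j*chunk:(j+1)*chunk]
        arl = [lv.apply(Analyze, p) for p in ps]  # submit one chunk of jobs
        lv.wait(arl)                              # block until this chunk is done

        for i, ar in enumerate(arl):
            ds[j*chunk + i] = ar.get()            # write this chunk into the HDF5 dataset
        hdf.flush()

        clear_cache(rc, lv)                       # drop cached results before the next chunk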


On Fri, Dec 14, 2012 at 7:05 AM, Johann Rohwer <jr@sun.ac.za> wrote:

> To follow up on my own post, I have run the script below for a single
> chunk (600 parameter sets). The ipython process then consumes about
> 2.5 GB of memory. I have then deleted the rc, lv, arl, ds and hdf
> objects in the ipython console where I ran the script. I have also
> imported gc and run gc.collect(). However, the ipython process still
> uses 2.5 GB of memory. Only when I quit the ipython session is the
> memory freed.
>
> (Note that the ipcontroller and ipengine processes are behaving fine.)
>
> I am really puzzled by this...
> --Johann
>
> On Friday 14 December 2012 10:49:16 Johann Rohwer wrote:
> > I'm running a very large computation on a 110-node ipcluster with a
> > shared home directory, where the engines are launched with ssh.
> > Basically it's a total of 2349 runs which each produce a (512,50,23)
> > dataset, so the total resulting data will be on the order of 6 GB.
> > The problem I'm experiencing is that the ipython process that runs
> > the client just keeps increasing in memory and in the end kills the
> > machine (the machine running the controller as well as the client
> > has about 5 GB of RAM including swap).
> >
> > To get around this I've chunked the computation so that the machine
> > can cope with the memory for one chunk at a time. The problem is
> > that the memory does not appear to be freed when
> > rc.purge_results('all') is called (see below) and just keeps
> > increasing in size. I'm using HDF5 for data storage. I've also tried
> > playing around with simple jobs, just calling one function and then
> > running rc.purge_results('msg_id_of_async_result'), but the result
> > is still listed in rc.results and does not seem to be cleared. What
> > am I doing wrong? Should I perhaps be deleting the rc object after
> > each chunk and re-instantiating it?
> >
> > The essential points of the code producing the problem are included
> > below.
> >
> > --Johann
> >
> > --------------------------- 8-< ------------------------------------
> > from h5py import File
> > from IPython.parallel import Client
> >
> > def Analyze(p):
> >     <some code>
> >     return res # numpy array of shape (512,50,23)
> >
> > hdf = File('mod_res.hdf','w')
> > s = (2349,512,50,23)
> > ds = hdf.require_dataset('results',shape=s,dtype='f')
> > hdf.flush()
> >
> > rc = Client()
> > lv = rc.load_balanced_view()
> >
> > # pall is an input array of parameters for Analyze, length 2349
> > chunk = 600               # chunk size for splitting up the computation
> > n = len(pall)/chunk + 1   # number of runs
> >
> > for j in range(n):
> >     arl = []  # asynchronous results list
> >     ps = pall[j*chunk:(j+1)*chunk].copy()
> >
> >     for i in range(len(ps)):
> >         arl.append(lv.apply(Analyze, ps[i]))
> >     lv.wait()
> >
> >     for i in range(len(arl)):
> >         ar = arl[i].get()
> >         ds[j*chunk+i] = ar
> >         hdf.flush()
> >
> >     rc.purge_results('all')
> >
> > hdf.close()
> >
>