The purge_results function fixed my problem. Specifically, I used<div><br></div><div> rc.purge_results('all')</div><div><br></div><div>In case it is helpful to anyone, I have been clearing all caches (that I know of) with the following routine</div>
<div><br></div><div><div>def clear_cache(rc, dview):</div><div>    rc.purge_results('all')  # clears the controller (Hub) cache</div>
<div>    rc.results.clear()</div><div>    rc.metadata.clear()</div><div>    dview.results.clear()</div><div>    assert not rc.outstanding, "don't clear history when tasks are outstanding"</div>
<div>    rc.history = []</div><div>    dview.history = []</div><br><div class="gmail_quote">On Tue, Jul 10, 2012 at 8:26 AM, MinRK <span dir="ltr"><<a href="mailto:benjaminrk@gmail.com" target="_blank">benjaminrk@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">The *Controller* is running out of memory, and you are clearing the<br>
cache on the Client side. If you are not using the extra delayed<br>
result retrieval functionality, you can disable the Hub database,<br>
which should put a stop to Hub memory growth.<br>
<br>
You can do this at the command-line with:<br>
<br>
ipcontroller --nodb<br>
<br>
Or permanently in ipcontroller_config.py:<br>
<br>
HubFactory.db_class = 'NoDB'<br>
<br>
This is now the default in 0.13, as the db backend is not commonly used.<br>
<br>
If you are using the Hub's extra facilities, you can instruct it to<br>
drop results with:<br>
<br>
client.purge_results()<br>
<br>
-MinRK<br>
<div><div class="h5"><br>
On Fri, Jul 6, 2012 at 3:56 PM, Robert Nishihara<br>
<<a href="mailto:robertnishihara@gmail.com">robertnishihara@gmail.com</a>> wrote:<br>
> I'm running multiple trials of the same experiment in a for loop.<br>
><br>
> for i in range(10):<br>
>     run_experiment()<br>
><br>
> It behaves properly for the first several trials. Then it fails with the<br>
> error (this error goes to the controller's standard error)<br>
><br>
> MemoryError<br>
> FATAL ERROR: OUT OF MEMORY (epoll.cpp:57)<br>
><br>
> I've read this thread<br>
> <<a href="http://mail.scipy.org/pipermail/ipython-user/2012-March/009687.html" target="_blank">http://mail.scipy.org/pipermail/ipython-user/2012-March/009687.html</a>>, and<br>
> so I am already clearing the caches between trials with this subroutine<br>
><br>
> def clear_cache(rc, dview):<br>
>     rc.results.clear()<br>
>     rc.metadata.clear()<br>
>     dview.results.clear()<br>
>     assert not rc.outstanding, "don't clear history when tasks are outstanding"<br>
>     rc.history = []<br>
>     dview.history = []<br>
><br>
> But given that the memory error occurs after multiple successful trials, it<br>
> seems like something must be accumulating. Are there other sources of<br>
> caching that I'm missing? Is anything cached on the engines for instance? I<br>
> do not store my results between trials, I use cPickle to dump them to files.<br>
><br>
> -Robert<br>
><br>
><br>
><br>
><br>
> The full error from the controller's standard error is included below<br>
> ----------------------------<br>
><br>
> ERROR:root:Uncaught exception, closing connection.<br>
> Traceback (most recent call last):<br>
> File<br>
> "/software/linux/x86_64/epd-7.3-1/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py",<br>
> line 391, in _handle_events<br>
> self._handle_recv()<br>
> File<br>
> "/software/linux/x86_64/epd-7.3-1/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py",<br>
> line 412, in _handle_recv<br>
> msg = self.socket.recv_multipart(zmq.NOBLOCK, copy=self._recv_copy)<br>
> File "socket.pyx", line 723, in zmq.core.socket.Socket.recv_multipart<br>
> (zmq/core/socket.c:6495)<br>
> File "socket.pyx", line 616, in zmq.core.socket.Socket.recv<br>
> (zmq/core/socket.c:5961)<br>
> File "socket.pyx", line 650, in zmq.core.socket.Socket.recv<br>
> (zmq/core/socket.c:5832)<br>
> File "socket.pyx", line 120, in zmq.core.socket._recv_copy<br>
> (zmq/core/socket.c:1681)<br>
> File "message.pyx", line 75, in zmq.core.message.copy_zmq_msg_bytes<br>
> (zmq/core/message.c:1082)<br>
> MemoryError<br>
> ERROR:root:Exception in I/O handler for fd <zmq.core.socket.Socket object at<br>
> 0x162a6b0><br>
> Traceback (most recent call last):<br>
> File<br>
> "/software/linux/x86_64/epd-7.3-1/lib/python2.7/site-packages/zmq/eventloop/ioloop.py",<br>
> line 330, in start<br>
> self._handlers[fd](fd, events)<br>
> File<br>
> "/software/linux/x86_64/epd-7.3-1/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py",<br>
> line 391, in _handle_events<br>
> self._handle_recv()<br>
> File<br>
> "/software/linux/x86_64/epd-7.3-1/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py",<br>
> line 412, in _handle_recv<br>
> msg = self.socket.recv_multipart(zmq.NOBLOCK, copy=self._recv_copy)<br>
> File "socket.pyx", line 723, in zmq.core.socket.Socket.recv_multipart<br>
> (zmq/core/socket.c:6495)<br>
> File "socket.pyx", line 616, in zmq.core.socket.Socket.recv<br>
> (zmq/core/socket.c:5961)<br>
> File "socket.pyx", line 650, in zmq.core.socket.Socket.recv<br>
> (zmq/core/socket.c:5832)<br>
> File "socket.pyx", line 120, in zmq.core.socket._recv_copy<br>
> (zmq/core/socket.c:1681)<br>
> File "message.pyx", line 75, in zmq.core.message.copy_zmq_msg_bytes<br>
> (zmq/core/message.c:1082)<br>
> MemoryError<br>
> FATAL ERROR: OUT OF MEMORY (epoll.cpp:57)<br>
> /usr/share/gridengine/hpc/spool/cloudcompute-5/job_scripts/1998: line 14:<br>
> 31003 Aborted (core dumped) ipcontroller --profile=sge<br>
><br>
</div></div>> _______________________________________________<br>
> IPython-User mailing list<br>
> <a href="mailto:IPython-User@scipy.org">IPython-User@scipy.org</a><br>
> <a href="http://mail.scipy.org/mailman/listinfo/ipython-user" target="_blank">http://mail.scipy.org/mailman/listinfo/ipython-user</a><br>
><br>
</blockquote></div><br></div>
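<div>A minimal sketch of how the cache-clearing routine at the top of this message could be combined with the trial loop from the original question. It is an illustration under stated assumptions, not a definitive implementation: <code>run_trials</code> is a hypothetical helper name (not part of IPython), and the <code>assert</code> has been moved to the top of <code>clear_cache</code> so that nothing is cleared while tasks are still outstanding. The attributes and the <code>purge_results('all')</code> call are the ones used in the routine above.</div>

```python
def clear_cache(rc, dview):
    """Drop cached results on the Hub and on the client side.

    `rc` is assumed to behave like an IPython.parallel Client and `dview`
    like a DirectView; only the attributes used in this thread are touched.
    """
    # Assumption of this sketch: check first, so nothing is cleared
    # while tasks are still in flight.
    assert not rc.outstanding, "don't clear history when tasks are outstanding"
    rc.purge_results('all')  # ask the Hub (controller) to forget stored results
    rc.results.clear()       # client-side result cache
    rc.metadata.clear()      # client-side metadata cache
    dview.results.clear()    # view-side result cache
    rc.history = []
    dview.history = []


def run_trials(rc, dview, run_experiment, n_trials=10):
    """Hypothetical helper: run each trial, then clear caches before the next."""
    for _ in range(n_trials):
        run_experiment()
        clear_cache(rc, dview)
```

<div>With a running cluster, <code>rc</code> would typically be an <code>IPython.parallel.Client()</code> and <code>dview</code> would be <code>rc[:]</code>; calling <code>run_trials(rc, dview, run_experiment)</code> then takes the place of the bare <code>for</code> loop.</div>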