[IPython-User] out of memory error

MinRK benjaminrk@gmail....
Tue Jul 10 10:26:28 CDT 2012


The *Controller* is running out of memory, but you are only clearing
the caches on the Client side.  If you are not using the Hub's delayed
result retrieval functionality, you can disable the Hub database,
which should put a stop to Hub memory growth.

You can do this at the command-line with:

ipcontroller --nodb

Or permanently in ipcontroller_config.py:

c.HubFactory.db_class = 'NoDB'

This is now the default in 0.13, as the db backend is not commonly used.

If you are using the Hub's extra facilities, you can instruct it to
drop results with:

client.purge_results('all')
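
As a rough sketch of how the two cleanups could be combined between
trials: the helper below clears the Client-side caches (the same
attributes as the clear_cache routine quoted below) and then asks the
Hub to forget its records.  The name clear_all_caches and the purge_hub
parameter are made up for illustration, and the code assumes rc is a
connected Client and dview a view created from it.  It passes 'all' as
the jobs argument because, as far as I recall, purge_results expects at
least one of jobs or targets to be given.

```python
def clear_all_caches(rc, dview, purge_hub=True):
    """Drop cached results on the Client and, optionally, on the Hub.

    rc        -- a connected IPython.parallel Client (assumed)
    dview     -- a view created from rc, e.g. rc[:] (assumed)
    purge_hub -- hypothetical switch; set to False when the Hub runs with NoDB
    """
    # Refuse to clear while tasks are still in flight.
    assert not rc.outstanding, "don't clear history when tasks are outstanding"

    # Client-side caches: results, metadata, and message-id history.
    rc.results.clear()
    rc.metadata.clear()
    dview.results.clear()
    rc.history = []
    dview.history = []

    # Hub-side records: only meaningful when the Hub database is enabled.
    if purge_hub:
        rc.purge_results('all')
```

Calling clear_all_caches(rc, dview) after each run_experiment() in the
loop should keep both sides from accumulating results, provided each
trial has already written its output to disk.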

-MinRK

On Fri, Jul 6, 2012 at 3:56 PM, Robert Nishihara
<robertnishihara@gmail.com> wrote:
> I'm running multiple trials of the same experiment in a for loop.
>
>     for i in range(10):
>         run_experiment()
>
> It behaves properly for the first several trials. Then it fails with the
> error (this error goes to the controller's standard error)
>
>     MemoryError
>     FATAL ERROR: OUT OF MEMORY (epoll.cpp:57)
>
> I've read this thread
> <http://mail.scipy.org/pipermail/ipython-user/2012-March/009687.html>, and
> so I am already clearing the caches between trials with this subroutine
>
>     def clear_cache(rc, dview):
>         rc.results.clear()
>         rc.metadata.clear()
>         dview.results.clear()
>         assert not rc.outstanding, "don't clear history when tasks are outstanding"
>         rc.history = []
>         dview.history = []
>
> But given that the memory error occurs after multiple successful trials, it
> seems like something must be accumulating. Are there other sources of
> caching that I'm missing? Is anything cached on the engines for instance? I
> do not store my results between trials, I use cPickle to dump them to files.
>
> -Robert
>
>
>
>
> The full error from the controller's standard error is included below
> ----------------------------
>
> ERROR:root:Uncaught exception, closing connection.
> Traceback (most recent call last):
>   File "/software/linux/x86_64/epd-7.3-1/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 391, in _handle_events
>     self._handle_recv()
>   File "/software/linux/x86_64/epd-7.3-1/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 412, in _handle_recv
>     msg = self.socket.recv_multipart(zmq.NOBLOCK, copy=self._recv_copy)
>   File "socket.pyx", line 723, in zmq.core.socket.Socket.recv_multipart (zmq/core/socket.c:6495)
>   File "socket.pyx", line 616, in zmq.core.socket.Socket.recv (zmq/core/socket.c:5961)
>   File "socket.pyx", line 650, in zmq.core.socket.Socket.recv (zmq/core/socket.c:5832)
>   File "socket.pyx", line 120, in zmq.core.socket._recv_copy (zmq/core/socket.c:1681)
>   File "message.pyx", line 75, in zmq.core.message.copy_zmq_msg_bytes (zmq/core/message.c:1082)
> MemoryError
> ERROR:root:Exception in I/O handler for fd <zmq.core.socket.Socket object at 0x162a6b0>
> Traceback (most recent call last):
>   File "/software/linux/x86_64/epd-7.3-1/lib/python2.7/site-packages/zmq/eventloop/ioloop.py", line 330, in start
>     self._handlers[fd](fd, events)
>   File "/software/linux/x86_64/epd-7.3-1/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 391, in _handle_events
>     self._handle_recv()
>   File "/software/linux/x86_64/epd-7.3-1/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 412, in _handle_recv
>     msg = self.socket.recv_multipart(zmq.NOBLOCK, copy=self._recv_copy)
>   File "socket.pyx", line 723, in zmq.core.socket.Socket.recv_multipart (zmq/core/socket.c:6495)
>   File "socket.pyx", line 616, in zmq.core.socket.Socket.recv (zmq/core/socket.c:5961)
>   File "socket.pyx", line 650, in zmq.core.socket.Socket.recv (zmq/core/socket.c:5832)
>   File "socket.pyx", line 120, in zmq.core.socket._recv_copy (zmq/core/socket.c:1681)
>   File "message.pyx", line 75, in zmq.core.message.copy_zmq_msg_bytes (zmq/core/message.c:1082)
> MemoryError
> FATAL ERROR: OUT OF MEMORY (epoll.cpp:57)
> /usr/share/gridengine/hpc/spool/cloudcompute-5/job_scripts/1998: line 14: 31003 Aborted                 (core dumped) ipcontroller --profile=sge
>