[IPython-User] client.spin() increasing execution time, and MemoryError

MinRK benjaminrk@gmail....
Wed Mar 14 12:09:44 CDT 2012


On Wed, Mar 14, 2012 at 08:22, RICHARD Georges-Emmanuel
<perspective.electronic@gmail.com> wrote:
> Hi all, Hi Minrk,
>
> I'm not sure if I should post on IPython list or PyZMQ list,

IPython is the right place.

>
> Brief: Usage of IPython parallel, I use DirectView object to "execute" code
> on 32 engines and "pull" to retrieve results.
> Those results are simply compared (means read only) to some range limit to
> know which engine got error and which pass.
> We repeat this; previous results are erased by new ones, so the memory
> footprint on the engines is constant.
> from the IPython Client side, after a long run (it depends on user usage)
> - currently it is not very intensive but as the system is never restarted
> after several weeks it crashed,
> - then it will be something like 1 execute every 1 to 3 seconds followed by a
> pull, I expect a crash within a few days at this rhythm  :-( .

Wow, that's rough.  I hope we can figure this out.

>
> TestCase:
> -open a shell execute:
>     ipcontroller --ip='*' --nodb
> -then open another shell to start 32 engines,  I use python to do it:
> import time,os
> for i in range(32):
>     os.popen("ipengine&")
>     time.sleep(1)
> -execute the script (README_script_test.py attached to this mail):
>
> When I started to debug and profile IPython internal method, I started to
> insert some time.time() in each method involved in AsyncResult, finally I
> pointed the client.py spin() method,
> as I don't want you to modify your installation, in the script provided I
> define a myClient class which inherits from IPython.parallel.Client and I
> override the spin method to add the time.time() calls that help me figure out
> where the time is spent.
> Then it's a simple dview.execute to set a variable on all engines, followed
> by a loop to retrieve the variable 1000 times.
> a boolean WRITECSVFILE can be set to True to open the profiling data in oocalc
> or excel
> a boolean TEST_MEMORY_ERROR can be set to True; it just does more loops, and
> may need to be extended to reach the MemoryError, but that's probably not
> desired.
>
>
> Issues:
> 1) with a result of type list, a growing delay appears every N calls to
> AsyncResult.get() while I retrieve it (N is 4 to 5 when pulling a list of 200
> elements from 32 engines, no matter whether the elements are floats only,
> strings only, or mixed).
>     I did some profiling, this is what the test script help to produce, I
> also modified client.py
>     http://i40.tinypic.com/1z6bpuh.png   or    http://dl.free.fr/bvYnPT3rR
>
>     I managed to notice spin() method of the client.py in which the
> _flush_results(self._mux_socket) seems to be where the time is spent, but
> then I do see the pyzmq interface involved but I don't understand what's
> wrong with it.

I'll have to further investigate this one. I don't see anything
particularly strange in my running of the script.  Can you do some
profiling, to see what operations are actually taking the time?
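For the profiling, something along these lines may be enough. This is only a sketch using the standard library's cProfile: `profile_call` and `pull_loop` are made-up names, and `pull_loop` is a stand-in for the real execute/pull loop from the test script, which is not reproduced here.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the slowest call chains come first.
    stats.sort_stats("cumulative").print_stats(15)
    return result, stream.getvalue()


def pull_loop(n=1000):
    # Hypothetical stand-in: replace the body with the real pull loop.
    total = 0
    for _ in range(n):
        total += len(list(range(200)))
    return total


result, report = profile_call(pull_loop, 1000)
print(report)
```

Sorting by cumulative time should show whether the time is going into spin() itself or into something it calls.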

>
> 2) memory usage keeps growing, and finally reaches a MemoryError,
>     no matter the type of data retrieved: numpy array, simple string, float
> or a list.

This one's easy: https://github.com/ipython/ipython/issues/1131

It's the very top of my IPython todo list, but getting it right in
general is a bit tricky with AsyncResults, etc., so I don't have an
implementation yet.  Maybe later today...

The Client caches all results in its `results` dictionary.  These are
not deduplicated with hashes or anything.  Just one object per Result
(note that each pull you are doing produces *32* results).  So by the
end of your script, you have at least 32000 copies of the list you are
pulling.
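To make the arithmetic concrete, here is a toy model of a per-message cache that is never cleared. It does not use IPython at all: `simulate_result_cache` and the msg_id format are made up for illustration, and the defaults (1000 pulls, 32 engines, a 200-element list) come from the test case described above.

```python
def simulate_result_cache(n_pulls=1000, n_engines=32, list_len=200):
    """Mimic a result cache that keeps one entry per engine per pull."""
    cache = {}
    for pull in range(n_pulls):
        for engine in range(n_engines):
            # Made-up id format; real msg_ids look different.
            msg_id = "pull%d-engine%d" % (pull, engine)
            # A separate list object is stored for every result.
            cache[msg_id] = [0.0] * list_len
    return cache


cache = simulate_result_cache()
print(len(cache))  # 1000 pulls * 32 engines = 32000 entries
```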

To clear this dictionary, simply call `client.results.clear()`.

For a full client-side purge:

# clear caches of results and metadata
client.results.clear()
client.metadata.clear()
dview.results.clear()

# Also (optionally) clear the history of msg_ids
assert not client.outstanding, "don't clear when tasks are outstanding"
client.history = []
dview.history = []

Using this, I have run large amounts of data-moving tasks (1 MB/s on
average) for a couple of days (totaling 100s of GB) without
encountering memory issues.

>
>  comments:
>        concerning 1)  I understand the serialization following the type of
> results, it takes more or less time to retrieve, BUT if the time is rising
> pull after pull on the same result (it seems to be only with List) something
> else is involved, and I don't catch it.
>        concerning 2)  I thought the Hub Database could have been involved, so
> when I start the ipcontroller I added --nodb, but I still got the
> MemoryError, moreover this option concerns only the ipcontroller app,
>
> also I tried    rc.purge_results('all'), but didn't see any impact.

purge_results is strictly a Hub DB operation.  Working out exactly how

> are    rc.history,    dview.results,    rc.session.digest_history or other
> internal object should be cleared regularly?
>
> I read this:
>
> http://ipython.org/ipython-doc/stable/parallel/parallel_details.html#non-copying-sends-and-numpy-arrays
>
> What is sendable (...)
>
> (unless the data is very small)      -----> I'm not sure what that means,
> but if a result is pulled twice, how can I make the 2nd erase the 1st
> one?
>
>
> Question:
> any kind people who want give a try to the script I attached to confirm the
> behaviour, any hint to avoid my IPython client process to slow down, and
> reach a MemoryError, or do I miss a page in the manual about clear the
> memory in the user application?

This should certainly be in the docs, but it probably hasn't been
because I have considered it a bug I should fix, rather than behavior
to be documented.

-MinRK

> Thanks in advance, keep up the good work!
>
> Environment:
> Linux RHEL 5.5, python 2.6.6
> ipython 0.12, then upgrade to the latest git 0.13dev
> zmq+pyzmq 2.1.7, then upgrade to the latest zmq+pyzmq 2.1.11
> jsonlib2     1.3.10
>
> Cheers,
>        Joe
>
> --
> RICHARD Georges-Emmanuel
>
>
> _______________________________________________
> IPython-User mailing list
> IPython-User@scipy.org
> http://mail.scipy.org/mailman/listinfo/ipython-user
>

