[IPython-User] client.spin() increasing execution time, and MemoryError
Wed Mar 14 12:09:44 CDT 2012
On Wed, Mar 14, 2012 at 08:22, RICHARD Georges-Emmanuel wrote:
> Hi all, Hi Minrk,
> I'm not sure if I should post on IPython list or PyZMQ list,
IPython is the right place.
> Brief: Usage of IPython parallel, I use DirectView object to "execute" code
> on 32 engines and "pull" to retrieve results.
> Those results are simply compared (read-only) to some range limits to
> know which engines fail and which pass.
> We repeat this; previous results are erased by new ones, so the memory
> footprint on the engines is constant.
> From the IPython Client side, after a long run (it depends on user usage):
> - currently usage is not very intensive, but as the system is never restarted,
> after several weeks it crashed;
> - later it will be something like one execute every 1 to 3 seconds followed by
> a pull, and I expect a crash within a few days at this rhythm :-( .
Wow, that's rough. I hope we can figure this out.
> -open a shell execute:
> ipcontroller --ip='*' --nodb
> -then open another shell to start 32 engines, I use python to do it:
> import time,os
> for i in range(32):
> -execute the script (README_script_test.py attached to this mail):
> When I started to debug and profile IPython's internal methods, I inserted
> some time.time() calls in each method involved in AsyncResult, and finally
> pinpointed the client.py spin() method.
> As I don't want you to modify your installation, in the script provided I
> define a myClient class which inherits from IPython.parallel.Client and
> overrides the spin method, adding the time.time() calls that helped me figure
> out where the time is spent.
> Then it's a simple dview.execute to set a variable on all engines, followed
> by a loop to retrieve the variable 1000 times.
> a boolean WRITECSVFILE can be set to True to open the profiling data in
> oocalc or excel
> a boolean TEST_MEMORY_ERROR can be set to True; it just does more loops, and
> maybe needs to be extended to reach the MemoryError, but it's probably not
> 1) with a result of type List, while I retrieve it a growing delay appears
> every N pulls (4 to 5 in the case of a 200-element list pulled from 32
> engines, no matter whether the elements are floats only, strings only, or
> mixed)
> I did some profiling; this is what the test script helps to produce. I
> also modified client.py:
> http://i40.tinypic.com/1z6bpuh.png or http://dl.free.fr/bvYnPT3rR
> I noticed the spin() method of client.py, in which the
> _flush_results(self._mux_socket) call seems to be where the time is spent;
> I see the pyzmq interface involved, but I don't understand what's
> wrong with it.
I'll have to further investigate this one. I don't see anything
particularly strange in my running of the script. Can you do some
profiling, to see what operations are actually taking the time?
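For example, the standard-library cProfile module can show which calls dominate. A minimal sketch of that kind of profiling session (the `pull_loop` function below is just a stand-in workload for your actual execute/pull loop, not IPython code):

```python
import cProfile
import io
import pstats

def pull_loop():
    """Stand-in workload: replace with your dview.execute()/pull() loop."""
    data = {}
    for i in range(1000):
        # mimic pulling a 200-element list on each iteration
        data[i] = [float(j) for j in range(200)]
    return data

profiler = cProfile.Profile()
profiler.enable()
pull_loop()
profiler.disable()

# Report the 10 most expensive calls, sorted by cumulative time
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
print(stream.getvalue())
```

Running this against the real client loop would show whether the time is going into `_flush_results`, socket recv, or deserialization.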
> 2) memory usage keeps growing, and finally reaches a MemoryError,
> no matter the type of data retrieved: numpy array, simple string, float,
> or list.
This one's easy: https://github.com/ipython/ipython/issues/1131
It's the very top of my IPython todo list, but getting it right in
general is a bit tricky with AsyncResults, etc., so I don't have an
implementation yet. Maybe later today...
The Client caches all results in its `results` dictionary. These are
not deduplicated with hashes or anything; there is one object per result
(note that each pull you are doing produces *32* results). So by the
end of your script, you have at least 32000 copies of the list you are
pulling.
To clear this dictionary, simply call `client.results.clear()`.
For a full client-side purge:

# clear caches of results and metadata
client.results.clear()
client.metadata.clear()
dview.results.clear()

# Also (optionally) clear the history of msg_ids
assert not client.outstanding, "don't clear when tasks are outstanding"
client.history = []
dview.history = []
Using this, I have run large numbers of data-moving tasks (1 MB/s on
average) for a couple of days (totaling 100s of GB) without
encountering memory issues.
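To make the caching behaviour concrete, here is a toy model of it using a plain dict in place of `client.results` (the names are illustrative, not the real Client internals):

```python
# Toy model: each "pull" across 32 engines deposits 32 result objects
# into a cache dict, the way Client.results accumulates results.
N_ENGINES = 32
N_PULLS = 1000

results_cache = {}   # stands in for client.results
msg_id = 0
for pull in range(N_PULLS):
    for engine in range(N_ENGINES):
        # one cached copy per engine, per pull
        results_cache[msg_id] = list(range(200))
        msg_id += 1

print(len(results_cache))  # 32000 cached results after 1000 pulls

# The remedy: purge the cache periodically, as with client.results.clear()
results_cache.clear()
print(len(results_cache))  # 0
```

The point is that the cache grows linearly with the number of pulls times the number of engines, which is why a long-running client must clear it periodically.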
> concerning 1) I understand that depending on the type of the results, it
> takes more or less time to retrieve them, BUT if the time keeps rising
> pull after pull on the same result (it seems to happen only with List),
> something else is involved, and I can't catch it.
> concerning 2) I thought the Hub Database could be involved, so
> when I start the ipcontroller I add --nodb, but I still get the
> MemoryError; moreover this option concerns only the ipcontroller app.
> I also tried rc.purge_results('all'), but didn't see any impact.
purge_results is strictly a Hub DB operation. Working out exactly how
the Client should clear its own caches is part of the issue above.
> Should rc.history, dview.results, rc.session.digest_history or other
> internal objects be cleared regularly?
> I read this:
> What is sendable (...)
> (unless the data is very small) -----> I'm not sure what that means,
> but if a result is pulled twice, how do I make the 2nd erase the 1st?
> Could any kind person give the attached script a try to confirm the
> behaviour? Any hint to keep my IPython client process from slowing down
> and reaching a MemoryError? Or did I miss a page in the manual about
> clearing memory in the user application?
This should certainly be in the docs, but it probably hasn't been
because I have considered it a bug I should fix, rather than behavior
to be documented.
> Thanks in advance, and keep up the good work!
> Linux RHEL 5.5, python 2.6.6
> ipython 0.12, then upgrade to the latest git 0.13dev
> zmq+pyzmq 2.1.7, then upgrade to the latest zmq+pyzmq 2.1.11
> jsonlib2 1.3.10
> RICHARD Georges-Emmanuel
> IPython-User mailing list