[IPython-User] client.spin() increasing execution time, and MemoryError
Wed Mar 14 10:22:35 CDT 2012
Hi all, Hi Minrk,
I'm not sure whether I should post this on the IPython list or the PyZMQ list.
*Brief*: I use IPython parallel: a DirectView object "executes" code on
32 engines and "pulls" the results.
Those results are simply compared (read-only) against range limits
to know which engines got an error and which passed.
We repeat this; previous results are overwritten by new ones, so the
memory footprint on the engines is constant.
On the IPython client side, the problem shows up after a long run (how
long depends on user usage):
- currently usage is not very intensive, but since the system is never
restarted, it crashed after several weeks,
- later it will be something like one execute every 1 to 3 seconds,
followed by a pull, so I expect a crash within a few days at this
rhythm :-( .
- open a shell and execute:
ipcontroller --ip='*' --nodb
- then open another shell to start 32 engines; I use Python to do it:
for i in range(32):
- execute the script (README_script_test.py, attached to this mail):
When I started to debug and profile IPython's internal methods, I
inserted time.time() calls in each method involved in AsyncResult, and
finally narrowed it down to the spin() method in client.py.
As I don't want you to modify your installation, the script provided
defines a myClient class which inherits from IPython.parallel.Client and
overrides the spin method to add the time.time() calls that helped me
figure out where the time is spent.
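The timing override described above can be sketched roughly like this.
It is only an illustration, not the attached script: TimedSpinMixin,
spin_times and FakeClient are hypothetical names, and FakeClient stands
in for IPython.parallel.Client so the snippet runs without a cluster.

```python
import time


class TimedSpinMixin(object):
    """Record how long each spin() call takes (hypothetical helper)."""

    def spin(self):
        start = time.time()
        try:
            # Delegate to the spin() of the next class in the MRO.
            return super(TimedSpinMixin, self).spin()
        finally:
            # Create the list lazily so no __init__ override is needed.
            times = self.__dict__.setdefault("spin_times", [])
            times.append(time.time() - start)


class FakeClient(object):
    """Stand-in for IPython.parallel.Client (assumption, demo only)."""

    def spin(self):
        time.sleep(0.001)  # pretend to flush incoming messages


class TimedFakeClient(TimedSpinMixin, FakeClient):
    pass


client = TimedFakeClient()
for _ in range(5):
    client.spin()

# One duration per call; a steadily rising series would point at spin().
print(len(client.spin_times), max(client.spin_times))
```

With the real client the mixin would presumably be combined as
class myClient(TimedSpinMixin, Client), assuming spin() takes no
arguments in the installed version.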
Then it's a simple dview.execute to set a variable on all engines,
followed by a loop that retrieves the variable 1000 times.
A boolean, WRITECSVFILE, can be set to True to write the profiling data
to a file that can be opened in oocalc or Excel.
A boolean, TEST_MEMORY_ERROR, can be set to True; it just does more loops,
and may need to be extended to reach the MemoryError, but it's probably not
1) With a result of type list, a growing delay appears every N pulls
while I retrieve it (N is 4 to 5 for a 200-element list pulled from 32
engines, no matter whether the elements are floats only, strings only,
or mixed).
I did some profiling; this is what the test script helps to produce.
I also modified client.py
http://i40.tinypic.com/1z6bpuh.png or http://dl.free.fr/bvYnPT3rR
I noticed that, in the spin() method of client.py, the call
_flush_results(self._mux_socket) seems to be where the time is spent.
I can see that the pyzmq interface is involved there, but I don't
understand what's wrong with it.
2) Memory usage keeps growing and finally reaches a MemoryError,
no matter the type of data retrieved: numpy array, simple string,
float or list.
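One way to quantify this growth is to log the client process memory
after every pull. A minimal sketch, assuming a Unix system where the
standard resource module is available; peak_rss is a hypothetical
helper, and the loop body only stands in for the real execute/pull
cycle.

```python
import resource


def peak_rss():
    """Return the peak resident set size of this process.

    ru_maxrss is a high-water mark: kilobytes on Linux, bytes on macOS.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


samples = []
kept = []
for i in range(5):
    # Stand-in for a pulled result that something keeps a reference to.
    kept.append([0.0] * 100000)
    samples.append(peak_rss())

# A series that keeps climbing over many iterations suggests that
# results are being retained somewhere in the client process.
print(samples)
```

Because ru_maxrss never decreases, it shows growth but not memory that
was later released; reading the current RSS would need another source
such as /proc on Linux.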
Concerning 1): I understand that, depending on the type of the results,
serialization makes retrieval take more or less time. BUT if the time
keeps rising pull after pull on the same result (it seems to happen only
with lists), something else is involved, and I can't work out what.
Concerning 2): I thought the Hub database could be involved, so I added
--nodb when starting the ipcontroller, but I still got the MemoryError;
moreover, this option concerns only the ipcontroller.
I also tried rc.purge_results('all'), but didn't see any impact.
Should rc.history, dview.results, rc.session.digest_history or
other internal objects be cleared regularly?
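For illustration, periodic clearing of such containers could look like
the sketch below. It is not a confirmed fix. rc.history, dview.results
and rc.session.digest_history are the names from the question above;
rc.results, rc.metadata and dview.history are assumptions about the
client internals, and Stub only imitates the real objects so the
snippet runs on its own.

```python
def _empty(container):
    """Empty a dict, set or list in place (Python 2 and 3)."""
    if hasattr(container, "clear"):
        container.clear()
    else:
        del container[:]  # Python 2 lists have no clear()


def clear_client_caches(rc, dview):
    """Empty client-side bookkeeping containers that exist.

    Attribute names are assumptions; missing ones are skipped.
    Returns the names of the containers that were emptied.
    """
    targets = [
        (rc, "rc", ("results", "metadata", "history")),
        (dview, "dview", ("results", "history")),
        (getattr(rc, "session", None), "rc.session", ("digest_history",)),
    ]
    cleared = []
    for obj, label, names in targets:
        for name in names:
            container = getattr(obj, name, None)
            if container is not None:
                _empty(container)
                cleared.append("%s.%s" % (label, name))
    return cleared


class Stub(object):
    """Bare object imitating rc / dview in this demo."""


rc = Stub()
rc.results = {"msg-1": [1.0, 2.0]}
rc.metadata = {"msg-1": {}}
rc.history = ["msg-1"]
rc.session = Stub()
rc.session.digest_history = set(["abc"])
dview = Stub()
dview.results = rc.results
dview.history = ["msg-1"]

cleared = clear_client_caches(rc, dview)
print(cleared)
```

Something like this would only be safe to call when no requests are
outstanding, since pending AsyncResult objects may still look up their
entries in these containers.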
I read this:
What is sendable (...)
(unless the data is very small) -----> I'm not sure what that
means, but if a result is pulled twice, how can I make the 2nd one
erase the 1st one?
Could any kind person try the script I attached to confirm the
behaviour? Any hint on how to keep my IPython client process from
slowing down and reaching a MemoryError would be welcome; or did I miss
a page in the manual about clearing memory in the user application?
Thanks in advance, and keep up the good work!
Linux RHEL 5.5, python 2.6.6
ipython 0.12, then upgraded to the latest git 0.13dev
zmq+pyzmq 2.1.7, then upgraded to the latest zmq+pyzmq 2.1.11