<br><br><div class="gmail_quote">On Mon, Jun 18, 2012 at 1:29 PM, Jon Olav Vik <span dir="ltr"><<a href="mailto:jonovik@gmail.com" target="_blank">jonovik@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="im">MinRK <benjaminrk <at> <a href="http://gmail.com" target="_blank">gmail.com</a>> writes:<br>
<br>
> [IPControllerApp] client::client 'c5dd44d5-b59d-4392-bac0-917e9ef4c9d8'<br>
> requested u'registration_request'<br>
> 2012-06-18 14:03:58.161 [IPClusterStart] Too many open files<br>
> (tcp_listener.cpp:213)<br>
> 2012-06-18 14:03:58.274 [IPClusterStart] Process '.../python' stopped: {'pid':<br>
> 21820, 'exit_code': -6}<br>
> 2012-06-18 14:03:58.275 [IPClusterStart] IPython cluster: stopping<br>
> 2012-06-18 14:03:58.275 [IPClusterStart] Stopping Engines...<br>
> 2012-06-18 14:04:01.281 [IPClusterStart] Removing pid file: .../.ipython/<br>
> profile_default/pid/ipcluster.pid<br>
> The culprit seems to be "Too many open files (tcp_listener.cpp:213)". I would<br>
> like to know where this limit is set, and how to modify it. Also, I wonder if<br>
> it would help to spread connection attempts out in time. That might help if the<br>
> problem is too many simultaneous requests, but not if the limit applies to how<br>
> many engines I can connect simultaneously. Any other advice would be welcome<br>
> too.<br>
<br>
> This is just the fd limit set by your system. See various docs on changing 'ulimit' for your system.<br>
<br>
</div>fd = file descriptor?</blockquote><div><br></div><div>Yes, sorry, fd is file descriptor. </div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> But the engines are running on separate computers, with<br>
from 8 to 24 cores. Does not the ulimit apply only within each computer?</blockquote><div><br></div><div>There are limits at a few levels, but the one that is relevant here is the *per-process* one, which in your case is 1024. It is only the Controller processes that have a number of FDs proportional to the number of engines, so that's the machine where you need to pay attention to this.</div>
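<div><br></div><div>As a rough sketch of how to inspect and raise that per-process limit from Python (standard-library <code>resource</code> module, Unix only): the helper name <code>raise_nofile_soft_limit</code> is made up for illustration, and this only affects the process that runs it and its children, so it would have to run in whatever launches the Controller. Running <code>ulimit -n</code> in the launching shell is the equivalent shell-level route.</div>

```python
import resource


def raise_nofile_soft_limit():
    """Try to raise this process's soft open-files limit to its hard limit.

    Returns (old_soft, new_soft, hard). Hypothetical helper, not part of IPython.
    """
    old_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        # An unprivileged process may raise its soft limit only as far as
        # the hard limit; going beyond that needs administrator changes.
        resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
    except (ValueError, OSError):
        # Some systems refuse this request; the old limit then stays in place.
        pass
    new_soft = resource.getrlimit(resource.RLIMIT_NOFILE)[0]
    return old_soft, new_soft, hard


old_soft, new_soft, hard = raise_nofile_soft_limit()
print("open files: soft %s -> %s (hard limit %s)" % (old_soft, new_soft, hard))
```

<div>If the hard limit itself is 1024, the soft limit cannot go higher this way and the system configuration has to change instead.</div>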
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> Is<br>
there any relationship between tcp and open files? (Sorry, I'm not a native on<br>
Linux.)<br></blockquote><div><br></div><div>Yes - each connection gets a new FD (it's actually a little more complicated than that with zeromq, but it's proportional to new connections).</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
I see the number 200 in "max user processes", but 1024 "max open files". Am I<br>
missing something, e.g. similar limits for network connections?<br></blockquote><div><br></div><div>No, "open files" (-n) is the limit you are hitting, and the one you want to increase.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
-bash-3.2$ ulimit -a<br>
core file size (blocks, -c) 0<br>
data seg size (kbytes, -d) unlimited<br>
scheduling priority (-e) 0<br>
file size (blocks, -f) unlimited<br>
pending signals (-i) 135168<br>
max locked memory (kbytes, -l) unlimited<br>
max memory size (kbytes, -m) unlimited<br>
open files (-n) 1024<br>
pipe size (512 bytes, -p) 8<br>
POSIX message queues (bytes, -q) 819200<br>
real-time priority (-r) 0<br>
stack size (kbytes, -s) 10240<br>
cpu time (seconds, -t) unlimited<br>
max user processes (-u) 200<br>
virtual memory (kbytes, -v) 4194304<br>
file locks (-x) unlimited<br>
<div class="im"><br>
<br>
> You can try to spread out connection attempts, but I don't think it will change anything.<br>
><br>
> I do not believe there are transient sockets during the connection process.<br>
<br>
</div>Does this mean that the number of parallel processes with IPython is limited by<br>
the permitted number of file descriptors?</blockquote><div><br></div><div>Yes, just as any networked process has a limited number of connections.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Two hundred really isn't too much,<br>
but I guess it'll have to do...<br></blockquote><div><br></div><div>This is a result of there being several zeromq connections for each engine, not just one.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
Would it be feasible to have all ipengines on a compute node use a single<br>
connection to the central ipcontroller? (Pardon me if I get the terminology<br>
wrong.)<br></blockquote><div><br></div><div>No, this is not feasible.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
Thank you for your help.<br>
<div class="HOEnZb"><div class="h5"><br>
Best regards,<br>
Jon Olav<br>
<br>
</div></div></blockquote></div><br>
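<div>To illustrate the point above that each connection gets its own FD, here is a small sketch with plain sockets from the standard library. It is not zeromq, where the accounting is a little more complicated, and the function name <code>fds_for_connections</code> is made up for the example.</div>

```python
import socket


def fds_for_connections(n):
    """Open n connected socket pairs and return the file descriptors they hold."""
    pairs = [socket.socketpair() for _ in range(n)]
    try:
        # Every open socket endpoint occupies one file descriptor in this process.
        return [sock.fileno() for pair in pairs for sock in pair]
    finally:
        # Closing the sockets hands the descriptors back to the process.
        for a, b in pairs:
            a.close()
            b.close()


fds = fds_for_connections(5)
print("%d descriptors held by 5 connected socket pairs" % len(fds))
```

<div>Both ends of each pair live in the same process here, so five connections cost ten descriptors; a process that holds only one end of each connection pays one per connection.</div>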