[IPython-User] ipython cluster engines and fd limit
Thu Sep 20 16:05:59 CDT 2012
I just saw the part above about starting the engines manually - since you
are starting 300 of them, I assume you are using some kind of script or
batch system. Can you post the actual code you use?
If it's a simple bash script, you could do something like this to spread
out your engine starts over a particular window:
window=60 # seconds
# sleep for a time up to 60 seconds, determined by the current PID.
sleep $(expr $$ % $window)
You could also use `$RANDOM` instead of `$$`.
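To make that concrete, here is a minimal sketch of what such a staggered launcher script might look like. The engine count and window are illustrative values taken from this thread, not IPython defaults, and in real use the `echo` would be replaced by something like `( sleep "$delay" && ipengine ... ) &`:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: spread many engine starts over a time window so the
# controller is not hit with hundreds of simultaneous connection attempts.
NENGINES=300   # illustrative count from this thread
WINDOW=60      # seconds over which to spread the starts

stagger_delay() {
    # Engine number -> delay in seconds; same idea as: sleep $(expr $$ % $window)
    echo $(( $1 % WINDOW ))
}

for i in $(seq 1 "$NENGINES"); do
    # Print the schedule; a real launcher would sleep and exec ipengine here.
    echo "engine $i: delay $(stagger_delay "$i")s"
done
```

This gives roughly `NENGINES / WINDOW` starts per second instead of one burst; using the engine index (or `$RANDOM`) rather than `$$` avoids all engines on one host picking the same delay.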
On Thu, Sep 20, 2012 at 1:54 PM, MinRK <firstname.lastname@example.org> wrote:
> How are you starting the engines (ipcluster, launcher config, etc.)? What
> is your system like (shared filesystem, ssh, nfs, etc.)?
> It's possible there are issues with too many simultaneous connection
> attempts, which could be addressed by adding a delay between each engine
> start (see various `delay` configurables in ipcluster_config.py, depending
> on how you are starting your cluster).
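> For reference, a fragment like the following in `ipcluster_config.py` is the kind of thing meant here (the exact configurable names depend on your IPython version and launcher, so treat this as a sketch rather than a verified setting):
>
> ```python
> # ipcluster_config.py -- sketch of staggering engine starts
> c = get_config()
>
> # Delay (in seconds) between starting each engine; smooths out the
> # burst of simultaneous connections to the controller.
> c.IPClusterEngines.delay = 1.0
> ```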
> On Thu, Sep 20, 2012 at 8:01 AM, M. Wimmer <email@example.com> wrote:
>> Just a little update: in fact, the problem I described does not seem to
>> have anything to do with the file descriptor limit. In /proc/pid/fd I can
>> see that there are fewer than 1024 files open. Actually, when I try to
>> start so many engines that the fd limit is reached, one of the
>> ipcontroller processes dies (without leaving an error message in the log).
>> I tried to increase the logging output by adding logging statements
>> myself in the code, but I'm still far from making progress (it's actually
>> quite hard to follow the IPython code logic, as there are many different
>> levels of inheritance - this is not a criticism; I'm just asking for help
>> in this matter).
>> All the best,